Retinoic acid receptor beta-2, its agonists, and gene therapy vectors for the treatment of neurological disorders

ABSTRACT

The present invention relates to the use of RARβ2 and/or an agonist thereof in the preparation of a medicament to cause neurite development.

FIELD OF THE INVENTION

[0001] The present invention relates to a factor relating to neurite growth. Furthermore, the invention relates to vectors capable of directing the expression of a factor relating to neurite growth.

BACKGROUND TO THE INVENTION

[0002] The human peripheral and central nervous system consists of terminally differentiated cells which are not capable of directing neurite outgrowth or neurite regeneration.

[0003] It is desirable to cause neurite development, such as neurite outgrowth and/or neurite regeneration, for example in cases of nervous injuries such as spinal cord injuries or in diseases such as diabetes or neuropathies.

[0004] Nerve growth factor (NGF) is known to stimulate certain events such as neurite outgrowth. However, NGF is a relatively large molecule with a correspondingly high molecular weight. Moreover, NGF is susceptible to protease mediated degradation. Due to these and other considerations, NGF is difficult to administer. NGF is also relatively expensive to prepare. These are problems associated with the prior art.

SUMMARY OF THE INVENTION

[0005] We have surprisingly found that it is possible to cause neurite development, such as neurite outgrowth and/or neurite regeneration, by using retinoic acid receptor β2 (RARβ2) and/or an agonist thereof. Moreover, it is surprisingly shown that RARβ2 can be delivered to non-dividing mammalian cells using vectors according to the invention.

SUMMARY ASPECTS OF THE PRESENT INVENTION

[0006] The present invention is based on the surprising finding that it is possible to cause neurite development, such as neurite outgrowth and/or neurite regeneration, by using RARβ2 and/or an agonist thereof, and that RARβ2 may be introduced into neuronal cells using retroviral vectors based on lentiviral vectors.

[0007] Aspects of the present invention utilise this finding. For example it is possible to have a method that causes modulation of neurite development, such as neurite outgrowth and/or neurite regeneration, by using RARβ2 and/or a vector comprising same and/or an agonist thereof as explained herein.

DETAILED ASPECTS OF THE PRESENT INVENTION

[0008] In one aspect, the present invention relates to a viral vector comprising a nucleic acid sequence encoding a receptor.

[0009] The viral vector may be based on or derived from a DNA virus, or an RNA virus (a retrovirus). Examples of such viral vectors include but are not limited to herpes viruses, adenoviruses, adeno-associated viruses, retroviruses, lentiviruses and other viruses. This is discussed in more detail below.

[0010] The receptor may be any eukaryotic receptor, such as a vertebrate receptor. Examples of such receptors include but are not limited to mammalian receptors, primate receptors and human receptors. This is explained more fully in the following section(s).

[0011] In another aspect, the present invention relates to a retroviral vector derived from a lentivirus genome comprising a nucleic acid sequence capable of directing the expression of a receptor.

[0012] In another aspect, the present invention relates to a viral vector comprising a nucleic acid sequence encoding the retinoic acid receptor β2 (RARβ2).

[0013] In another aspect, the present invention relates to a retroviral vector derived from a lentivirus genome comprising a nucleic acid sequence capable of directing the expression of the retinoic acid receptor β2 (RARβ2).

[0014] In another aspect, the present invention relates to a gene therapy vector comprising a nucleic acid sequence encoding a retinoic acid receptor β2. In a preferred aspect, delivery of the nucleic acid encoding the retinoic acid receptor β2 enables neurite growth.

[0015] In another aspect, the present invention relates to the use of a vector as described herein in the preparation of a medicament to cause neurite development.

[0016] In another aspect, the present invention relates to the use of a vector as described herein in the preparation of a medicament for the treatment of a neurological disorder.

[0017] In another aspect, the present invention relates to a method of treating a neurological disorder comprising administering a vector as described herein to a subject.

[0018] In another aspect, the present invention relates to a host cell when transduced by a vector as described herein.

[0019] In another aspect, the present invention relates to a pharmaceutical composition comprising a vector as described herein in admixture with a pharmaceutically acceptable carrier, diluent or excipient; wherein the pharmaceutical composition is for use to cause neurite development.

[0020] In another aspect, the present invention relates to the use of RARβ2 and/or an agonist thereof in the preparation of a medicament to cause neurite development.

[0021] The term ‘RARβ2’ as used herein may refer to the polypeptide translation product of the RARβ2 gene open reading frame (ORF), that is to say the actual receptor itself, or may refer to the nucleic acid ORF encoding said polypeptide, or may even occasionally refer to the RARβ2 gene itself. It will be apparent to the reader which of these entities, or combination of said entities, is referred to by the term ‘RARβ2’ from the particular context in which such term is used.

[0022] In the present invention the RARβ2 and/or an agonist can be termed a pharmaceutically active agent.

[0023] Neurites are well known structures which develop from various neuronal cell types. They appear as microscopic branch or comb-like structures or morphological projections from the surface of the cell from which they emanate. Examples of neurite outgrowth are shown in the accompanying figures, and in publications such as those referenced in (Maden 1998-review article), and are well known in the art.

[0024] The RARβ2 coding sequence (i.e. the RARβ2 gene) is used as described hereinbelow. The RARβ2 gene may be prepared by use of recombinant DNA techniques and/or by synthetic techniques. For example, it may be prepared using the PCR amplified gene fragment prepared as in the Examples section of this document using the primers etc. detailed therein, or it may be prepared according to any other suitable method known in the art.

[0025] In another aspect, the present invention relates to the use of RARβ2 and/or an agonist thereof in the preparation of a medicament to cause neurite development, wherein said agonist is retinoic acid (RA) and/or CD2019.

[0026] Retinoic acid is commercially available. CD2019 is a polycyclic heterocarbyl molecule which is a RARβ2 agonist having the structure as discussed herein and as shown in (Elmazar et al., (1996) Teratology vol. 53 pp158-167).

[0027] In another aspect, the present invention relates to the use of RARβ2 and/or an agonist thereof in the preparation of a medicament for the treatment of a neurological disorder.

[0028] In another aspect, the present invention relates to the use of RARβ2 and/or an agonist thereof in the preparation of a medicament for the treatment of a neurological disorder, wherein said neurological disorder comprises neurological injury.

[0029] In another aspect, the present invention relates to a method of treating a neurological disorder comprising administering a pharmacologically active amount of an RARβ2 receptor, and/or an agonist thereof.

[0030] In another aspect, the present invention relates to a method of treating a neurological disorder comprising administering a pharmacologically active amount of an RARβ2 receptor, and/or an agonist thereof, wherein said agonist is RA and/or CD2019.

[0031] In another aspect, the present invention relates to a method of treating a neurological disorder comprising administering a pharmacologically active amount of an RARβ2 receptor, and/or an agonist thereof, wherein said RARβ2 receptor is administered by an entity comprising a RARβ2 expression system.

[0032] In another aspect, the present invention relates to a method of causing neurite development in a subject, said method comprising providing a nucleic acid construct capable of directing the expression of at least part of a RARβ2 receptor, introducing said construct into one or more cells of said subject, and optionally administering a RARβ2 agonist, such as RA and/or CD2019, to said subject.

[0033] In a further aspect, the invention relates to an assay method for determining whether an agent is capable of modulating RARβ2 signalling, said method comprising providing neural cells, contacting said cells with said agent, and assessing the activity of the RARβ2 receptor, such as through the monitoring of neurite outgrowth.

[0034] Neural cells for use in the assay method of the invention may be any suitable neural cell line, whether stably maintained in culture, or primary cells derived from an animal directly. Preferably said cells will be embryonic mouse dorsal root ganglion (DRG) cells prepared as described hereinbelow.
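
The readout of the assay method described above can be illustrated with a minimal sketch in Python. It assumes that neurite outgrowth is scored as a list of neurite lengths per condition and that an agent is flagged when the mean length changes by a chosen fold. The function names, the 1.5-fold cut-off and the example measurements are hypothetical and are not taken from this document; a real assay would use replicate cultures and an appropriate statistical test.

```python
from statistics import mean


def neurite_outgrowth_ratio(control_lengths, treated_lengths):
    """Return mean treated neurite length divided by mean control length."""
    if not control_lengths or not treated_lengths:
        raise ValueError("both conditions need at least one measurement")
    control_mean = mean(control_lengths)
    if control_mean <= 0:
        raise ValueError("control mean must be positive")
    return mean(treated_lengths) / control_mean


def modulates_outgrowth(control_lengths, treated_lengths, min_fold_change=1.5):
    """Flag an agent whose outgrowth ratio rises or falls by at least min_fold_change.

    The threshold is an illustrative assumption, not a value from the text.
    """
    ratio = neurite_outgrowth_ratio(control_lengths, treated_lengths)
    return ratio >= min_fold_change or ratio <= 1.0 / min_fold_change


# Hypothetical neurite lengths (arbitrary units) for untreated and treated cultures.
control = [40.0, 50.0, 60.0]
treated = [90.0, 100.0, 110.0]

ratio = neurite_outgrowth_ratio(control, treated)
flagged = modulates_outgrowth(control, treated)
```

With these placeholder values the treated cultures show a two-fold increase in mean neurite length, so the agent would be flagged as a candidate modulator of RARβ2 signalling for further study.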

[0035] In a further aspect, the invention relates to a process comprising the steps of (i) performing the assay for modulation of RARβ2 signalling described above, (ii) identifying one or more agents that are capable of modulating said RARβ2 signalling, and (iii) preparing a quantity of those one or more identified agents.

[0036] In a further aspect, the invention relates to a process comprising the steps of (i) performing the assay for modulation of RARβ2 signalling described above, (ii) identifying one or more agents that are capable of modulating said RARβ2 signalling, (iii) preparing a quantity of those one or more identified agents, and (iv) preparing a pharmaceutical composition comprising those one or more identified agents.

[0037] In a further aspect, the invention relates to a method of affecting the in vivo activity of RARβ2 with an agent, wherein the agent is capable of modulating RARβ2 signalling, for example capable of modulating RARβ2 signalling in an in vitro assay method as described above.

[0038] In a further aspect, the invention relates to the use of an agent in the preparation of a pharmaceutical composition for the treatment of a neurological disorder or injury, wherein the agent is capable of modulating RARβ2 signalling, for example capable of modulating RARβ2 signalling in an in vitro assay method as described above.

[0039] In a further aspect, the invention relates to a method of treating a subject with an agent, wherein the agent is capable of modulating RARβ2 signalling, for example capable of modulating RARβ2 signalling in an in vitro assay method as described above.

[0040] In a further aspect, the invention relates to a pharmaceutical composition comprising RARβ2 and/or an agonist thereof in admixture with a pharmaceutically acceptable carrier, diluent or excipient; wherein the pharmaceutical composition is for use to cause neurite development.

[0041] In a further aspect of the invention, there is provided a viral vector genome comprising nucleic acid sequence(s) capable of directing the expression of a receptor. Preferably said vector genome comprises nucleic acid sequence(s) capable of directing the expression of at least part of the RARβ2 receptor.

[0042] In a further aspect of the invention, there is provided a retroviral vector genome comprising nucleic acid sequence(s) capable of directing the expression of at least part of RARβ2, said genome containing a deleted gag gene from a lentivirus wherein the deletion in gag removes one or more nucleotides downstream of nucleotide 350 of the gag coding sequence. Preferably the deletion extends from nucleotide 350 to at least the C-terminus of the gag-pol coding region. More preferably the deletion additionally removes nucleotide 300 of the gag coding region and most preferably the deletion retains only the first 150 nucleotides of the gag coding region. However, even larger deletions of gag can also be used; for example, the gag coding region may contain only the first 109 nucleotides of the gag coding region. It may also be possible for the gag coding region to contain only the first 2 nucleotides of the gag coding region. Preferably, said vector genome is capable of directing the expression of substantially all of the RARβ2 polypeptide.
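
The progressively larger gag deletions described above can be pictured with a short Python sketch. It models only the simplest case, in which everything downstream of the retained 5′ portion of the gag coding sequence is removed. The sequence used is a toy placeholder and not the EIAV gag sequence, and the function name truncate_gag is hypothetical.

```python
def truncate_gag(gag_cds: str, retained_nt: int) -> str:
    """Keep only the first retained_nt nucleotides of a gag coding sequence."""
    if not 0 <= retained_nt <= len(gag_cds):
        raise ValueError("retained_nt must lie within the coding sequence")
    return gag_cds[:retained_nt]


# Placeholder sequence for illustration only: NOT a real lentiviral gag sequence.
toy_gag = "ATG" + "ACGT" * 120  # 483 nucleotides

# Retained lengths named in the text: 350, 150, 109 and 2 nucleotides.
RETAINED_OPTIONS = (350, 150, 109, 2)

truncated = {n: truncate_gag(toy_gag, n) for n in RETAINED_OPTIONS}
deleted_counts = {n: len(toy_gag) - len(seq) for n, seq in truncated.items()}
```

Each entry of truncated is the retained 5′ fragment for one option, and deleted_counts shows how much of the toy coding sequence each option removes.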

[0043] Preferably, the vector of the present invention is based on or derived from a lentivirus. More preferably, the vector of the present invention is based on or derived from a non-primate lentivirus. In a highly preferred embodiment, the vector of the present invention is based on or derived from a non-primate lentivirus such as equine infectious anaemia virus (EIAV). This is discussed in more detail below.

[0044] Additional features of the lentiviral genome are included in the vector genome which are necessary for transduction of the target cell such as reverse transcription and integration. These are, at least, a portion of an LTR containing sequence from the R-region and U5 region, sequences adjacent to the 3′LTR which contain a polypurine tract (PPT) and a 3′LTR from the lentivirus or a hybrid LTR containing sequences from the lentivirus and other elements. Optionally, the retroviral genome may contain accessory genes derived from a retrovirus, such as, but not limited to, a rev gene, a tat gene, a vif gene, a nef gene, a vpr gene or an S2 gene. Additional components may be added such as introns, splice-donor sites, a rev responsive element (RRE), sequences called the cPPT containing the polymerase region (Stetor S R, Rausch J W, Guo M J, Burnham J P, Boone L R, Waring M J, Le Grice S F ‘Characterization of (+) strand initiation and termination sequences located at the center of the equine infectious anemia virus genome.’ Biochemistry. Mar. 23, 1999;38(12):3656-67), cloning sites and selectable marker genes.

[0045] Moreover, it has been demonstrated (eg. see WO 99/32646) that a lentivirus minimal vector system can be constructed which requires neither S2, Tat, env nor dUTPase for either vector production or for transduction of dividing and non-dividing cells. A lentivirus minimal vector system can also be constructed which requires neither S2, Tat, env, rev nor dUTPase for either vector production or for transduction of dividing and non-dividing cells.

[0046] Thus according to another aspect the lentivirus genome from which the vector is derived lacks one or more accessory genes.

[0047] The deletion of accessory genes is highly advantageous. Firstly, it permits vectors to be produced without the genes normally associated with disease in lentiviral (e.g. HIV) infections. In particular, tat and nef are associated with disease. Secondly, the deletion of accessory genes permits the vector to package more heterologous DNA. Thirdly, genes whose function is unknown, such as dUTPase and S2, may be omitted, thus reducing the risk of causing undesired effects.

[0048] In addition, we have shown that the leader sequence of the lentivirus genome is essential for high protein expression.

[0049] Therefore in a further aspect the lentivirus genome from which the vector is derived lacks the tat gene but includes the leader sequence between the end of the 5′ LTR and the ATG of gag.

[0050] These data further define a minimal essential set of functional components for an optimal lentiviral vector. A vector is provided with maximal genetic capacity and high titre, but without accessory genes that are either of unknown function (S2, dUTPase), and therefore may present risk, or are analogues of HIV proteins that may be associated with AIDS (tat, rev).

[0051] It will be appreciated that the present invention provides a retroviral vector derived from a lentivirus genome comprising nucleic acid sequence capable of directing the expression of at least part of RARβ2 and (1) comprising a deleted gag gene wherein the deletion in gag removes one or more nucleotides downstream of nucleotide 350 of the gag coding sequence; (2) wherein one or more accessory genes are absent from the lentivirus genome; (3) wherein the lentivirus genome lacks the tat gene but includes the leader sequence between the end of the 5′ LTR and the ATG of gag; and combinations of (1), (2) and (3). In a preferred embodiment the retroviral vector comprises all of features (1) and (2) and (3).

[0052] A “non-primate” vector, as used herein, refers to a vector derived from a virus which does not primarily infect primates, especially humans. Thus, non-primate virus vectors include vectors which infect non-primate mammals, such as dogs, sheep and horses, reptiles, birds and insects.

[0053] A lentiviral or lentivirus vector, as used herein, is a vector which comprises at least one component part derived from a lentivirus. Preferably, that component part is involved in the biological mechanisms by which the vector infects cells, expresses genes or is replicated.

[0054] The lentivirus may be any member of the family of lentiviridae. Preferably the lentivirus is one which does not naturally infect a primate (‘non-primate lentivirus’). Such viruses may include a feline immunodeficiency virus (FIV), a bovine immunodeficiency virus (BIV), a caprine arthritis encephalitis virus (CAEV), a Maedi visna virus (MVV) or an equine infectious anaemia virus (EIAV). Preferably the lentivirus is an EIAV. Equine infectious anaemia virus infects all equidae resulting in plasma viremia and thrombocytopenia (Clabough, et al. 1991. J Virol. 65:6242-51). Virus replication is thought to be controlled by the process of maturation of monocytes into macrophages.

[0055] EIAV has the simplest genomic structure of the lentiviruses. In addition to the gag, pol and env genes, EIAV encodes three other genes: tat, rev, and S2. Tat acts as a transcriptional activator of the viral LTR (Derse and Newbold 1993 Virology. 194:530-6; Maury, et al 1994 Virology. 200:632-42.) and Rev regulates and coordinates the expression of viral genes through rev-response elements (RRE) (Martarano et al 1994 J Virol. 68:3102-11.). The mechanisms of action of these two proteins are thought to be broadly similar to the analogous mechanisms in the primate viruses (Martarano et al ibid). The function of S2 is unknown. In addition, an EIAV protein, Ttm, has been identified that is encoded by the first exon of tat spliced to the env coding sequence at the start of the transmembrane protein.

[0056] In addition to protease, reverse transcriptase and integrase, lentiviruses contain a fourth pol gene product which codes for a dUTPase. This may play a role in the ability of these lentiviruses to infect certain non-dividing cell types.

[0057] The viral RNA in aspect(s) of the invention is transcribed from a promoter, which may be of viral or non-viral origin, but which is capable of directing expression in a eukaryotic cell such as a mammalian cell. Optionally an enhancer is added, either upstream of the promoter or downstream. The RNA transcript is terminated at a polyadenylation site which may be the one provided in the lentiviral 3′ LTR or a different polyadenylation signal.

[0058] Thus the present invention provides a DNA transcription unit comprising a promoter and optionally an enhancer capable of directing expression of a retroviral vector genome.

[0059] Transcription units as described herein comprise regions of nucleic acid containing sequences capable of being transcribed. Thus, sequences encoding mRNA, tRNA and rRNA are included within this definition. The sequences may be in the sense or antisense orientation with respect to the promoter. Antisense constructs can be used to inhibit the expression of a gene in a cell according to well-known techniques. Nucleic acids may be, for example, ribonucleic acid (RNA) or deoxyribonucleic acid (DNA) or analogues thereof. Sequences encoding mRNA will optionally include some or all of 5′ and/or 3′ transcribed but untranslated flanking sequences naturally, or otherwise, associated with the translated coding sequence. It may optionally further include the associated transcriptional control sequences normally associated with the transcribed sequences, for example transcriptional stop signals, polyadenylation sites and downstream enhancer elements. Nucleic acids may comprise cDNA or genomic DNA (which may contain introns).

[0060] In another aspect, the present invention relates to a retroviral vector derived from a lentivirus genome comprising a nucleic acid sequence capable of directing the expression of at least part of RARβ2 and comprising a deleted gag gene wherein the deletion in gag removes one or more nucleotides downstream of nucleotide 350 of the gag coding sequence.

[0061] In another aspect, the present invention relates to a retroviral vector as described herein, wherein the deletion extends from nucleotide 350 to at least the C-terminus of the gag-pol coding region.

[0062] In another aspect, the present invention relates to a retroviral vector as described herein, wherein the deletion additionally removes nucleotide 300 of the gag coding region.

[0063] In another aspect, the present invention relates to a retroviral vector as described herein, wherein the deletion retains the first 150 nucleotides of the gag coding region. In another aspect, the present invention relates to a retroviral vector as described herein, wherein the deletion retains the first 109 nucleotides of the gag coding region.

[0064] In another aspect, the present invention relates to a retroviral vector as described herein, wherein the deletion retains only the first 2 nucleotides of the gag coding region.

[0065] In another aspect, the present invention relates to a retroviral vector as described herein, wherein the deletion is of the entire gag coding region.

[0066] In another aspect, the present invention relates to a retroviral vector derived from a lentivirus genome wherein one or more accessory genes are absent from the lentivirus genome.

[0067] In another aspect, the present invention relates to a retroviral vector as described herein, wherein the accessory genes are selected from dUTPase, S2, rev and tat.

[0068] In another aspect, the present invention relates to a retroviral vector derived from a lentivirus genome such as EIAV wherein the lentivirus genome lacks the tat gene but includes the leader sequences between the end of the 5′ LTR and the ATG of gag.

[0069] In another aspect, the present invention relates to a retroviral vector as described herein, which comprises at least one component from an equine lentivirus.

[0070] In another aspect, the present invention relates to a retroviral vector as described herein, wherein the equine lentivirus is EIAV.

[0071] In another aspect, the present invention relates to a retroviral vector as described herein, wherein the retroviral vector is substantially derived from EIAV.

[0072] In another aspect, the present invention relates to a method comprising transfecting or transducing a cell with a retroviral vector as described herein.

[0073] In another aspect, the present invention relates to a delivery system in the form of a retroviral vector as described herein.

[0074] In another aspect, the present invention relates to a cell transfected or transduced with a retroviral vector as described herein.

[0075] In another aspect, the present invention relates to use of a retroviral vector as described herein.

[0076] In another aspect, the present invention relates to use of a gene therapy vector as described herein.

[0077] In another aspect, the invention relates to the use of lentiviral gene therapy vectors for the delivery of retinoic acid receptor β2 to the peripheral and central nervous systems.

[0078] In another aspect, the present invention relates to a gene therapy vector comprising a nucleic acid sequence encoding a retinoic acid receptor β2. In a preferred aspect, delivery of the nucleic acid encoding the retinoic acid receptor β2 enables neurite growth.

[0079] In another aspect, the invention relates to EIAV gene therapy vectors configured to express retinoic acid receptor β2 (RARβ2).

[0080] In another aspect, the invention relates to methods for producing expression of RARβ2 in adult mammalian (such as human) spinal cord. Expression of RARβ2 in adult spinal cord is shown to stimulate neurite outgrowth and regeneration. Thus, in a preferred aspect, the invention relates to methods for stimulation of neurite outgrowth and/or regeneration in mammalian neuronal cells.

[0081] As used herein, the term ‘adult’ is used to mean non-foetal and/or non-embryonic. The term thus includes adults per se, as well as including young such as children and/or pups or other such infants. Thus, the term ‘adult’ as used herein may be understood to include any ‘post-natal’ ie. post-birth organism.

[0082] In another aspect, the invention relates to a differential expression screening method for identifying genes involved in a cellular process which method comprises comparing gene expression in: a first cell of interest; and a second cell of interest which cell comprises altered levels, relative to physiological levels, of a biological molecule due to the introduction into the second cell of a heterologous nucleic acid encoding at least part of RARβ2; and identifying gene products whose expression differs. Preferably, said heterologous nucleic acid encodes substantially all of RARβ2. Optionally, retinoic acid or an analogue thereof may also be present in the cellular environment of one or preferably both cells of interest. This method or a variant thereof may be advantageously applied to comparison of non-dividing neuronal cells with a different sample of the same cells which have been induced to exhibit neurite outgrowth, such as via transduction with a vector delivering RARβ2, or via other techniques discussed herein.
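
The comparison step of the differential expression screening method can be sketched in Python as follows. The sketch assumes that expression has already been measured as one numeric value per gene in each cell population, and it reports genes whose log2 fold change exceeds a chosen magnitude. The gene names, expression values, pseudocount and threshold are all hypothetical placeholders rather than data from this document.

```python
import math


def differential_genes(control, transduced, min_abs_log2_fc=1.0, pseudocount=1.0):
    """Return {gene: log2 fold change} for genes whose expression differs.

    control and transduced map gene names to expression values. Only genes
    measured in both populations are compared. The pseudocount avoids
    division by zero for unexpressed genes.
    """
    hits = {}
    for gene in sorted(set(control) & set(transduced)):
        log2_fc = math.log2(
            (transduced[gene] + pseudocount) / (control[gene] + pseudocount)
        )
        if abs(log2_fc) >= min_abs_log2_fc:
            hits[gene] = log2_fc
    return hits


# Hypothetical expression values for a first cell of interest (control) and a
# second cell of interest transduced with a vector delivering RARbeta2.
control_cells = {"geneA": 10.0, "geneB": 100.0, "geneC": 50.0}
transduced_cells = {"geneA": 43.0, "geneB": 105.0, "geneC": 12.0}

hits = differential_genes(control_cells, transduced_cells)
```

In this toy example geneA is reported as up-regulated and geneC as down-regulated, while geneB is excluded because its expression is essentially unchanged.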

[0083] For ease of reference, these and further aspects of the present invention are now discussed under appropriate section headings. However, the teachings under each section are not necessarily limited to each particular section.

[0084] Preferable Aspects

[0085] In a preferred aspect, the administration of a nucleic acid construct capable of directing the expression of RARβ2 will be accompanied by the administration of a RARβ2 agonist such as RA, or preferably CD2019 (or a mimetic thereof).

[0086] Preferably said agonist will be to some degree selective for the RARβ2 receptor. Preferably said agonist will not significantly affect the RARα receptor. Preferably said agonist will not significantly affect the RARγ receptor. More preferably said agonist will not significantly affect the RARα receptor or the RARγ receptor. Even more preferably, said agonist will exhibit a high degree of selectivity for the RARβ2 receptor.
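
One common way to express the degree of selectivity discussed above is as a ratio of potencies at the different receptor subtypes. The following Python sketch assumes that a half-maximal effective concentration (EC50) has been measured for a candidate agonist at each receptor, with a lower EC50 meaning higher potency. The numeric values and the function name are hypothetical; this document does not specify selectivity thresholds.

```python
def beta_selectivity(ec50_nm):
    """Fold selectivity for RARbeta2 over the more sensitive off-target receptor.

    ec50_nm maps receptor names to EC50 values in nM. The result is the
    smaller of the RARalpha and RARgamma EC50 values divided by the
    RARbeta2 EC50, so larger numbers mean greater selectivity.
    """
    beta = ec50_nm["RARbeta2"]
    if beta <= 0:
        raise ValueError("EC50 values must be positive")
    off_target = min(ec50_nm["RARalpha"], ec50_nm["RARgamma"])
    return off_target / beta


# Hypothetical EC50 values (nM) for an illustrative candidate agonist.
candidate = {"RARalpha": 1000.0, "RARbeta2": 10.0, "RARgamma": 500.0}

fold = beta_selectivity(candidate)
```

For these placeholder values the candidate is 50-fold more potent at RARβ2 than at its nearest off-target receptor, which is the kind of figure that could be used to rank agonists by selectivity.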

[0087] In a preferred aspect, the administration of a nucleic acid construct capable of directing the expression of RARβ2 will be accomplished using a vector, preferably a viral vector, more preferably a retroviral vector. In a highly preferred embodiment, the administration of a nucleic acid construct capable of directing the expression of RARβ2 will be accomplished using a retroviral vector capable of infecting non-dividing mammalian cells such as neural cells. This retroviral vector will preferably be derived from a lentiviral vector (preferably a non-primate lentiviral vector as discussed above), more preferably said vector will be derived from an equine infectious anaemia virus (EIAV). In a highly preferred aspect, said EIAV-derived vector will be a pseudotyped particle, such as VSV-G pseudotyped, or Rabies G pseudotyped.

[0088] Advantages

[0089] The present invention is advantageous because RARβ2 and/or an agonist thereof can cause modulation of neural cell development.

[0090] It is also an advantage of the present invention that administration of NGF to a subject is avoided.

[0091] It is also an advantage of the present invention that it enables neurite outgrowth to be promoted in adult neural tissue.

[0092] It is also an advantage of the present invention that it enables RARβ2 to be introduced into non-dividing mammalian cells such as neuronal cells.

[0093] It is also an advantage of the present invention that the receptor may be delivered to cells whose environment comprises endogenous levels of agonist of the receptor, such as retinoic acid (RA).

[0094] Retinoids

[0095] Retinoids are a family of molecules derived from vitamin A and include the biologically active metabolite, retinoic acid (RA). The cellular effects of RA are mediated through the action of two classes of receptors, the retinoic acid receptors (RARs), which are activated by both all-trans-RA (tRA) and 9-cis-RA, and the retinoid X receptors (RXRs), which are activated only by 9-cis-RA (Kastner et al., 1994; Kliewer et al., 1994). The receptors are of three major subtypes, α, β and γ, of which there are multiple isoforms due to alternative splicing and differential promoter usage (Leid et al.). The RARs mediate gene expression by forming heterodimers with the RXRs, whilst the RXRs can mediate gene expression as homodimers or by forming heterodimers with a variety of orphan receptors (Mangelsdorf & Evans, 1995). Many studies on a variety of embryonic neuronal types have shown that RA can stimulate both neurite number and length (review, Maden, 1998), as, indeed, can the neurotrophins (Campenot, 1977; Lindsay, 1988; Tuttle and Mathew, 1995). The neurotrophins are a family of growth factors that are required for the survival of a variety of primary sensory neurons in the developing peripheral nervous system (Snider, 1994). One of the earliest genes induced by NGF in PC12 cells is the orphan receptor NGFI-B (NURR1) (Milbrandt, 1989). This suggests that the growth factor and retinoid mediated pathways in developing neurons can interact.

[0096] Background teachings on these aspects have been presented by Victor A. McKusick et al on http://www.ncbi.nlm.nih.gov/Omim. The following information has been extracted from that source.

[0097] Three retinoic acid receptors, alpha, beta, and gamma, are members of the nuclear receptor superfamily. Retinoic acid was the first morphogen described in vertebrates. The RARA and RARB genes are more homologous to those of the 2 closely related thyroid hormone receptors THRA and THRB, located on chromosomes 17 and 3, respectively, than to any other members of the nuclear receptor family. These observations suggest that the thyroid hormone and retinoic acid receptors evolved by gene, and possibly chromosome, duplications from a common ancestor which itself diverged rather early in evolution from the common ancestor of the steroid receptor group of the family. The RARB gene, formerly symbolized HAP, maps to 3p24 by somatic cell hybridization and in situ hybridization.

[0098] Benbrook et al. (1988) showed a predominant distribution in epithelial tissues and therefore used the designation RAR(epsilon). By in situ hybridization, Mattei et al. (1988) assigned the RARB gene to 3p24. Using deletion mapping, de The et al. (1990) identified a 27-bp fragment located 59-bp upstream of the transcriptional start, which confers retinoic acid responsiveness on the herpesvirus thymidine kinase promoter. They found indications that both alpha and beta receptors act through the same DNA sequence. Mattei et al. (1991) assigned the corresponding gene to chromosome 14, band A, in the mouse, and to chromosome 15 in the rat.

[0099] Nadeau et al. (1992) confirmed assignment of the mouse homolog to the centromeric portion of chromosome 14.

[0100] From a comparison of a hepatitis-B virus (HBV) integration site present in a particular human hepatocellular carcinoma (HCC) with the corresponding unoccupied site in the nontumorous tissue of the same liver, Dejean et al. (1986) found that HBV integration placed the viral sequence next to a liver cell sequence that bears a striking resemblance to both an oncogene, ERBA, and the supposed DNA-binding domain of the human glucocorticoid receptor and estrogen receptor genes.

[0101] Dejean et al. (1986) suggested that this gene, usually silent or transcribed at a very low level in normal hepatocytes, becomes inappropriately expressed as a consequence of HBV integration, thus contributing to the cell transformation.

[0102] By means of a panel of rodent-human somatic cell hybrid DNAs, Dejean et al. (1986) localized the gene to chromosome 3. Further studies by de The et al. (1987) suggested that the HAP gene product may be a novel ligand-responsive regulatory protein whose inappropriate expression in liver is related to hepatocellular carcinogenesis. Brand et al. (1988) showed that the novel protein called HAP (for HBV-activated protein) is a retinoic acid receptor. They referred to this receptor as the beta type (RARB) and mapped it to 3p25-p21.

[0103] Lotan et al. (1995) found that the expression of RARB mRNA is selectively lost in premalignant oral lesions and can be restored by treatment with isotretinoin. Restoration of the expression of RARB mRNA was associated with a clinical response.

[0104] RARB, RARG, RXRB, and RXRG are expressed in the striatum. To study the effect of these genes on locomotion, Kreczel et al. (1998) developed single and double knockout mice and analyzed their locomotor skills by open field and rotarod testing. RARB-RXRB, RARB-RXRG, and RXRB-RXRG double null mutant mice, but not the corresponding single null mutants, exhibited reductions in forward locomotion when compared with wildtype littermates. Forty percent of the RARB-RXRB null mutants showed backward locomotion. Rotarod test performance was impaired for RARB, RARB-RXRB, RARB-RXRG, and RXRB-RXRG mice. In contrast, RARA, RARG, RARA-RXRG, and RARG-RXRG null mice showed no defects in locomotion, even though both RARA and RARG are also expressed in the striatum. The morphology, development, and function of skeletal muscle, peripheral nerves, and spinal cord were normal in all single and double null mutants, as were balance reflexes. These results suggested to Kreczel et al. (1998) that RARB, RXRB, and RXRG are involved specifically in the control of locomotor behaviors, and that heterodimers of RARB with either RXRB or RXRG are the functional receptor units, such that RXRB and RXRG are functionally redundant.

[0105] Kreczel et al. (1998) studied the expression of D1 and D2 dopamine receptors (D1R and D2R), the most abundant dopamine receptors in the striatum, in these mutant mice. RARB-RXRB, RARB-RXRG, and RXRB-RXRG double null mutants, but not RARB or RXRG single mutants, exhibited 40% and 30% reduction in whole-striatal D1R and D2R transcripts, respectively, when compared with wildtype controls.

[0106] The reduction was mostly in the medioventral regions of the striatum, including the shell and core of the nucleus accumbens, and the mediodorsal part of the caudate putamen. The reduction was not due to loss of D2R-expressing neurons; no increase in apoptosis was noted. The histology of the striatum was normal.

[0107] The characterization of a retinoic acid response element in the D2R promoter by Samad et al. (1997) led Kreczel et al. (1998) to suggest that the reduction in D1R and D2R expression occurs on a transcriptional level. The RARB-RXRB, RARB-RXRG, and RXRB-RXRG double null mutants did not exhibit the normal increase in locomotion induced by cocaine, mimicking the phenotype of D1R-null mice.

[0108] Taken together, these results indicated to Kreczel et al. (1998) that retinoids are involved in controlling the function of the dopaminergic mesolimbic pathway and suggested that defects in retinoic acid signaling may contribute to neurological disorders.

[0109] AGONISTS

[0110] The agonist of the present invention may be any suitable RARβ2 agonist. Preferably, said agonist of RARβ2 is capable of activating RARβ2 in a transactivation assay.

[0111] The agonist may be an organic compound or other chemical. The agonist can be an amino acid sequence or a chemical derivative thereof, or a combination thereof. The agent may even be a nucleotide sequence—which may be a sense sequence or an anti-sense sequence. The agent may even be an antibody.

[0112] Typically, the agonist will be an organic compound. Typically the organic compound will comprise two or more hydrocarbyl groups. Here, the term “hydrocarbyl group” means a group comprising at least C and H and may optionally comprise one or more other suitable substituents. Examples of such substituents may include halo-, alkoxy-, nitro-, an alkyl group, a cyclic group etc. In addition to the possibility of the substituents being a cyclic group, a combination of substituents may form a cyclic group. If the hydrocarbyl group comprises more than one C then those carbons need not necessarily be linked to each other. For example, at least two of the carbons may be linked via a suitable element or group. Thus, the hydrocarbyl group may contain hetero atoms. Suitable hetero atoms will be apparent to those skilled in the art and include, for instance, sulphur, nitrogen and oxygen. For some applications, preferably the agonist comprises at least one cyclic group. The cyclic group may be a polycyclic group, such as a non-fused polycyclic group. For some applications, the agonist comprises at least one of said cyclic groups linked to another hydrocarbyl group.

[0113] Specific Agonists

[0114] An example of a specific agonist according to the present invention is retinoic acid (RA). Both common forms of retinoic acid (either all-trans retinoic acid (tRA), or 9-cis-RA) are agonists of RARβ2.

[0115] CD2019 is a RARβ2 agonist having the structure as discussed herein and as shown in (Elmazar et al., (1996) Teratology vol. 53 pp158-167). This and other agonists are also discussed in (Beard and Chandraratna p.194; Johnson et al., 1996). The structure of CD2019 is presented as Formula I in the attached figures.

[0116] An alternative RARβ2 agonist is presented as Formula II in the attached figures.

[0117] The present invention also encompasses mimetics or bioisosteres of the formulae of Formula I and/or Formula II.

[0118] Preferably the agonist useful according to the present invention is selective for RARβ2.

[0119] Assay to Determine RARβ2 Agonism

[0120] Examples of agonists according to the present invention may be identified and/or verified by using an assay to determine RARβ2 agonism.

[0121] Hence, the present invention also encompasses (i) determining if a candidate agent is capable of acting as a RARβ2 agonist; and (ii) if said candidate agent is capable of acting as a RARβ2 agonist, then delivering said agent to a subject in such an amount as to cause neurite development.
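By way of illustration only, the decision in step (i) can be sketched in Python, assuming a reporter-based transactivation readout. The function names, the reporter readings and the 2-fold cut-off are hypothetical and are not taken from the present specification; a real assay would set its own threshold and controls.

```python
from statistics import mean

def fold_activation(candidate_readings, vehicle_readings):
    """Mean reporter signal with the candidate divided by mean signal with vehicle alone."""
    return mean(candidate_readings) / mean(vehicle_readings)

def is_agonist(candidate_readings, vehicle_readings, min_fold=2.0):
    """Call a candidate an agonist if it raises the reporter signal at least min_fold.

    min_fold is an assumed, illustrative cut-off.
    """
    return fold_activation(candidate_readings, vehicle_readings) >= min_fold

# Hypothetical replicate reporter readings (arbitrary units)
vehicle = [100.0, 110.0, 90.0]
active_candidate = [480.0, 520.0, 500.0]
inactive_candidate = [105.0, 95.0, 100.0]

active_fold = fold_activation(active_candidate, vehicle)
active_call = is_agonist(active_candidate, vehicle)
inactive_call = is_agonist(inactive_candidate, vehicle)
```

Only candidates for which the call is positive would proceed to step (ii).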

[0122] Assay

[0123] Any one or more of appropriate targets—such as an amino acid sequence and/or nucleotide sequence—may be used for identifying an agent capable of modulating RARβ2 in any of a variety of drug screening techniques. The target employed in such a test may be free in solution, affixed to a solid support, borne on a cell surface, or located intracellularly. The abolition of target activity or the formation of binding complexes between the target and the agent being tested may be measured.

[0124] The assay of the present invention may be a screen, whereby a number of agents are tested. In one aspect, the assay method of the present invention is a high throughput screen.

[0125] Techniques for drug screening may be based on the method described in Geysen, European Patent Application 84/03564, published on Sep. 13, 1984. In summary, large numbers of different small peptide test compounds are synthesized on a solid substrate, such as plastic pins or some other surface. The peptide test compounds are reacted with a suitable target or fragment thereof and washed. Bound entities are then detected, such as by appropriately adapting methods well known in the art. A purified target can also be coated directly onto plates for use in drug screening techniques. Alternatively, non-neutralising antibodies can be used to capture the peptide and immobilise it on a solid support.

[0126] This invention also contemplates the use of competitive drug screening assays in which neutralising antibodies capable of binding a target specifically compete with a test compound for binding to a target.

[0127] Another technique for screening provides for high throughput screening (HTS) of agents having suitable binding affinity to the substances and is based upon the method described in detail in WO-A-84/03564.

[0128] It is expected that the assay methods of the present invention will be suitable for both small and large-scale screening of test compounds as well as in quantitative assays.

[0129] In one preferred aspect, the present invention relates to a method of identifying agents that selectively modulate RARβ2.

[0130] In a preferred aspect, the assay of the present invention utilises cells that display RARβ2 on their surface. These cells may be isolated from a subject possessing such cells. However, preferably, the cells are prepared by transfecting cells so that upon transfection those cells display RARβ2 on their surface.

[0131] Another example of an assay that may be used is described in WO-A-9849271, which concerns an immortalised human terato-carcinoma CNS neuronal cell line, which is said to have a high level of neuronal differentiation and is useful in detecting compounds which bind to RARβ2.

[0132] In another aspect, the invention relates to the use of a vector capable of directing the expression of RARβ2 to produce cell(s) for use in agonist/antagonist assays. For example, in another aspect, the invention relates to an assay comprising neuronal cell(s), said cells comprising an EIAV-derived vector capable of directing the expression of RARβ2 in said cell(s).

[0133] Reporters

[0134] A wide variety of reporters may be used in the assay methods (as well as screens) of the present invention with preferred reporters providing conveniently detectable signals (eg. by spectroscopy). By way of example, a reporter gene may encode an enzyme which catalyses a reaction which alters light absorption properties.

[0135] Other protocols include enzyme-linked immunosorbent assay (ELISA), radioimmunoassay (RIA) and fluorescence-activated cell sorting (FACS). A two-site, monoclonal-based immunoassay utilising monoclonal antibodies reactive to two non-interfering epitopes may even be used. These and other assays are described, among other places, in Hampton R et al (1990, Serological Methods, A Laboratory Manual, APS Press, St Paul Minn.) and Maddox D E et al (1983, J Exp Med 158:1211).

[0136] Examples of reporter molecules include but are not limited to β-galactosidase, invertase, green fluorescent protein, luciferase, chloramphenicol acetyltransferase, β-glucuronidase, exo-glucanase and glucoamylase. Alternatively, radiolabelled or fluorescent tag-labelled nucleotides can be incorporated into nascent transcripts which are then identified when bound to oligonucleotide probes.

[0137] By way of further examples, a number of companies such as Pharmacia Biotech (Piscataway, N.J.), Promega (Madison, Wis.), and US Biochemical Corp (Cleveland, Ohio) supply commercial kits and protocols for assay procedures. Suitable reporter molecules or labels include radionuclides, enzymes, fluorescent, chemiluminescent, or chromogenic agents as well as substrates, cofactors, inhibitors, magnetic particles and the like. Patents teaching the use of such labels include U.S. Pat. No. 3,817,837; U.S. Pat. No. 3,850,752; U.S. Pat. No. 3,939,350; U.S. Pat. No. 3,996,345; U.S. Pat. No. 4,277,437; U.S. Pat. No. 4,275,149 and U.S. Pat. No. 4,366,241.

[0138] Differential Expression Screening Techniques

[0139] Genes encode gene products, mainly polypeptides but also RNAs, that are involved in a huge variety of cellular processes. The technique of differential expression screening is based on the idea that by comparing expression under two sets of conditions, genes whose expression varies between those two conditions can be identified and their function related back to the differences between those conditions. For example, genes involved in a pathway responsive to mitogens such as platelet-derived growth factor (PDGF) can be identified by comparing gene expression in cells exposed to PDGF versus gene expression in cells not exposed to PDGF. Thus the term “differential expression screening” as used herein means comparing gene expression between two cells under different conditions or two different cells under the same or different conditions, with the aim of identifying gene products that differ in their levels of expression between the two cells.
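The comparison described above can be illustrated, purely as a sketch, by computing a log2 fold change per gene between two conditions and retaining the genes whose change exceeds a cut-off. The gene names, expression values, pseudocount and 2-fold cut-off below are hypothetical and are chosen only to make the arithmetic easy to follow.

```python
import math

def differential_genes(expr_a, expr_b, min_log2_fold=1.0, pseudocount=1.0):
    """Return {gene: log2(b / a)} for genes whose absolute change meets the cut-off.

    expr_a, expr_b: {gene: expression level} for the two conditions.
    The pseudocount (an assumption) avoids division by zero for undetected genes.
    """
    hits = {}
    for gene in sorted(set(expr_a) | set(expr_b)):
        a = expr_a.get(gene, 0.0) + pseudocount
        b = expr_b.get(gene, 0.0) + pseudocount
        log2_fc = math.log2(b / a)
        if abs(log2_fc) >= min_log2_fold:
            hits[gene] = log2_fc
    return hits

# Hypothetical expression values (arbitrary units)
untreated = {"geneA": 100.0, "geneB": 50.0, "geneC": 10.0}
pdgf_treated = {"geneA": 105.0, "geneB": 210.0, "geneC": 2.0}

hits = differential_genes(untreated, pdgf_treated)
```

In this example geneA is essentially unchanged and is not reported, whereas geneB (raised) and geneC (lowered) are flagged as differing between the two conditions.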

[0140] The differences in gene expression may be measured using a variety of techniques. The first main type of technique is based on the measurement of nucleic acids and is termed herein as “genomic or cDNA techniques”. A useful review is provided in Kozian and Kirschbaum (1999). The second main type of technique is based on the measurement of cellular protein content and is termed herein as “proteomic techniques”.

[0141] Genomic or cDNA Techniques

[0142] One method well known in the art is subtractive cDNA hybridisation. This technique involves hybridising a population of mRNAs from one cell (e.g. a control cell) with a population of cDNAs made from the mRNA of another cell (e.g. a cell exposed to PDGF). This step will remove all sequences from the cDNA preparation that are common to both cells. The cDNAs derived from mRNAs whose expression is upregulated in the cell exposed to PDGF will not have a corresponding mRNA from the control with which to hybridise and can be isolated. Typically, the cDNAs are also hybridised with mRNA from the same cell to confirm that they represent coding sequences. This procedure is described in detail in WO90/11361 where mRNA from cells from the roots of plants treated with a chemical, N-(aminocarbonyl)-2-chlorobenzenesulphonamide, was used to produce a cDNA library that was then hybridised with mRNA from untreated root cells. The procedure identified a number of genes whose expression was upregulated by the chemical.
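The logic of the subtraction step can be sketched as a simple set difference. This is a deliberate simplification: it treats each sequence as either present or absent and ignores abundance and hybridisation kinetics, and the sequence names are hypothetical.

```python
def subtract_cdna(treated_cdnas, control_mrnas):
    """Keep the cDNAs from the treated cell that find no matching control mRNA.

    Sequences common to both populations are removed, mirroring the
    hybridisation step in a presence/absence model (an assumption).
    """
    control = set(control_mrnas)
    return [seq for seq in treated_cdnas if seq not in control]

# Hypothetical sequence identifiers
treated = ["actin", "geneX", "tubulin", "geneY"]
control = ["actin", "tubulin"]

remaining = subtract_cdna(treated, control)
```

Only the sequences without a counterpart in the control population remain for isolation.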

[0143] The polymerase chain reaction (PCR) has led to the development of a number of other methods. RT-PCR differential display was first described by Liang and Pardee (1992). This technique involves the use of oligo-dT primers and random 5′ oligonucleotide 10-mers to carry out PCR on reverse-transcribed RNA from different cell populations. PCR is often carried out using a radiolabelled nucleotide so that the products can be visualised after gel electrophoresis and autoradiography. Wilkinson et al. (1995) used PCR differential display to identify five mRNAs that are upregulated in strawberry fruit during ripening. A review of differential display RT-PCR (also known as differential display of mRNA) is provided in Zhang et al. (1998) and a recent improvement using ‘long distance’ PCR is described in Zhao et al. (1999).

[0144] Another technique is termed cDNA library screening. A review of this technique and the other two differential expression screening techniques mentioned above is provided in Maser and Calvet (1995).

[0145] Differential display competitive PCR is a fairly recent innovation that has been successfully used to study changes in global gene expression in situations where only a few genes change expression levels, such as exposure of MCF7 cells to oestradiol, and in more complex situations such as neuronal differentiation of human NTERA2 cells (Jorgensen et al., 1999).

[0146] A further PCR-based technique is representational difference analysis (RDA); see Kozian and Kirschbaum (1999) for review and references therein. Also reviewed in Kozian and Kirschbaum (1999) is a technique termed serial analysis of gene expression.

[0147] The actual identification of gene products whose expression differs between the two cell populations can be carried out in a number of ways. Subtractive methods will inherently identify gene products whose expression differs since gene products whose expression is the same are eliminated from the sample. Other methods include simply comparing the expression products from one cell with the expression products from another and looking for any differences (with PCR-based techniques, the number of products in each sample can be limited to a reasonable size), optionally with the aid of a computer program. For example using a PCR-based technique a visual comparison of bands present in different lanes allows the identification of bands unique to one lane. These bands can be cut out of the gel and subsequently analysed.

[0148] The advent of DNA chip technology allows comparisons to be conveniently conducted by the use of microarrays (see Kozian and Kirschbaum, 1999 for review and references therein). Typically, arrays are generated using cDNAs (including ESTs), PCR products, cloned DNA and synthetic oligonucleotides that are fixed to a substrate such as nylon filters, glass slides or silicon chips. To determine differences in gene expression, labelled cDNAs or PCR products are hybridised to the array and the hybridisation patterns compared. The use of fluorescently labelled probes allows two different cell populations to be applied simultaneously to one chip and the results measured at different wavelengths. A microarray-based differential expression screening technique is described in U.S. Pat. No. 5,800,992.
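Where two cell populations are read at two wavelengths on one chip, each spot yields a pair of intensities. A minimal sketch of how such pairs might be compared is given below. It assumes, for illustration only, that the log2 ratios are centred on their median to offset a systematic difference between the two labels; the probe names and intensities are hypothetical.

```python
import math
from statistics import median

def normalised_log_ratios(spots, floor=1.0):
    """spots: {probe: (channel_1_intensity, channel_2_intensity)}.

    Returns {probe: log2(ch2 / ch1) minus the median log ratio}.
    Median centring is an assumed normalisation, not one prescribed here;
    floor guards against zero intensities.
    """
    raw = {
        probe: math.log2(max(ch2, floor) / max(ch1, floor))
        for probe, (ch1, ch2) in spots.items()
    }
    centre = median(raw.values())
    return {probe: value - centre for probe, value in raw.items()}

# Hypothetical intensities: channel 2 reads about 2-fold brighter overall
spots = {
    "probe1": (100.0, 400.0),
    "probe2": (100.0, 200.0),
    "probe3": (100.0, 200.0),
    "probe4": (100.0, 100.0),
    "probe5": (100.0, 200.0),
}

ratios = normalised_log_ratios(spots)
```

After centring, probe1 stands out as relatively higher in the second population and probe4 as relatively lower, while the remaining probes sit at zero.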

[0149] Proteomic Techniques

[0150] Proteomics is the study of protein properties on a large scale to obtain a global, integrated view of disease processes, cellular processes and networks at the protein level. A review of techniques used in proteomics is given in Blackstock and Weir (1999); see also references provided therein. The methods of the present invention are mainly concerned with expression proteomics, the study of global changes in protein expression in cells using electrophoretic techniques and image analysis to resolve proteins. Whereas nucleic acid analysis emphasises the message, proteomics is more concerned with the product. The two approaches are sometimes complementary since proteomic techniques may be useful in detecting changes in polypeptide levels due to changes in protein stability rather than mRNA levels.

[0151] A well known and ubiquitous technique used in the field of proteomics involves measuring the polypeptide content of a cell using 2D polyacrylamide gel electrophoresis (PAGE) and comparing this with the polypeptide content of another cell. The results of electrophoresis are typically a gel visualised with a dye such as silver stain or Coomassie-blue, or an autoradiograph produced from the gel, all with spots corresponding to individual proteins. Fluorescent dyes are also available.

[0152] The aim is therefore to identify spots that differ between the two gels/autoradiographs, i.e. missing from one, reduced in intensity or increased in intensity. Thus in the case of proteomics, comparing gene expression simply involves comparing the protein profile from one cell with the protein profile from another. Commercial software packages are available for automated spot detection.
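The three kinds of difference named above (missing, reduced in intensity, increased in intensity) can be expressed as a simple classification over matched spot intensities. The sketch below is not a description of any commercial package; the spot identifiers, intensities and 2-fold cut-off are hypothetical.

```python
def classify_spots(gel_a, gel_b, min_fold=2.0):
    """Compare matched spots on two gels.

    gel_a, gel_b: {spot_id: intensity}; a spot absent from a dict is
    treated as not detected. Returns {spot_id: label} for differing spots,
    with labels expressed relative to gel A.
    """
    changes = {}
    for spot in sorted(set(gel_a) | set(gel_b)):
        a = gel_a.get(spot, 0.0)
        b = gel_b.get(spot, 0.0)
        if a > 0 and b == 0:
            changes[spot] = "missing in B"
        elif a == 0 and b > 0:
            changes[spot] = "missing in A"
        elif a > 0 and b > 0:
            if b >= a * min_fold:
                changes[spot] = "increased in B"
            elif b * min_fold <= a:
                changes[spot] = "reduced in B"
    return changes

# Hypothetical spot intensities (arbitrary units)
gel_a = {"s1": 10.0, "s2": 40.0, "s3": 5.0, "s4": 20.0}
gel_b = {"s1": 11.0, "s2": 10.0, "s3": 25.0, "s5": 8.0}

changes = classify_spots(gel_a, gel_b)
```

Spot s1 changes by less than the cut-off and is therefore not reported.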

[0153] Spots of interest may be excised from gels and the proteins identified using techniques such as matrix-assisted-laser-desorption-ionisation-time-of-flight (MALDI-TOF) mass spectrometry and electrospray.

[0154] It may be desirable to perform some measure of prefractionation, such as centrifugation or free-flow electrophoresis to improve the identification of low abundance proteins. Special procedures have also been developed for basic proteins, membrane proteins and other poorly soluble proteins (Rabilloud et al., 1997).

[0155] The above discussion provides a description of prior art methods available to the skilled person for performing differential expression screening of two or more cell populations in a general sense. However, the present invention is distinguished from these prior art methods in that a further step is required, namely that the levels of an endogenous biological molecule in a cell are altered by the experimenter, so that the levels of gene products that are affected by the molecule become more responsive to cellular perturbations such as signalling events. In other words, the object is to amplify and/or increase the signal to noise ratio of the differential response normally obtained so as to increase the likelihood of detecting gene products whose levels in a cell are low and/or whose expression normally changes by only a small amount.

[0156] By way of an example, the transcription factor HIF-1α is responsive to intracellular oxygen levels. Decreases in oxygen levels increase HIF-1α activity and lead to increased transcription from genes comprising a hypoxia responsive element (HRE). If the levels of HIF-1α in the cell are raised artificially, for example by infecting cells with a viral vector that directs expression of HIF-1α, then you would expect to see an increase in the transcriptional response mediated by HIF-1α. Consequently, changes in the expression of genes whose expression is sensitive to the HIF-1α mediated hypoxic response should be greater than in normal cells expressing physiological levels of HIF-1α.
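The expected amplification can be illustrated with a toy model. It assumes, purely for illustration, that transcription from an HRE-containing gene rises linearly with the product of the HIF-1α level and an oxygen-dependent activity, and that measurement noise is constant. All parameter values are hypothetical and the model ignores saturation and feedback.

```python
def differential_response(factor_level, activity_low_o2=0.8,
                          activity_normal_o2=0.1, gain=1.0, basal=1.0):
    """Difference in reporter output between low and normal oxygen.

    Output is modelled as basal + gain * factor_level * activity
    (an assumed linear model).
    """
    hypoxic = basal + gain * factor_level * activity_low_o2
    normoxic = basal + gain * factor_level * activity_normal_o2
    return hypoxic - normoxic

def signal_to_noise(factor_level, noise_sd=0.5):
    """Differential response divided by an assumed constant noise level."""
    return differential_response(factor_level) / noise_sd

physiological = signal_to_noise(1.0)    # physiological level of the factor
overexpressed = signal_to_noise(10.0)   # level raised 10-fold, e.g. by a vector
```

Under these assumptions the differential response, and hence the signal to noise ratio, scales in proportion to the level of the factor, which is the effect the method seeks to exploit.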

[0157] Biological Molecules

[0158] The biological molecule can be any compound that is found in cells as a result of anabolic or catabolic processes within a cell or as a result of uptake from the extracellular environment, by whatever means. The term “biological molecule” means that the molecule has activity in a biological sense. Preferably the biological molecule is synthesised within the cell, i.e. is endogenous to that cell, or in the case of multicellular organisms, also within any of the cells of the organism.

[0159] Examples of biological molecules will therefore include proteins, nucleic acids, carbohydrates, lipids, steroids, co-factors, prosthetic groups (such as haem), inorganic molecules, ions (such as Ca²⁺), inositides. Where appropriate, precursors, monomeric, oligomeric and polymeric forms, and breakdown products of the above are also included.

[0160] Examples of polypeptides include enzymes, transcription factors, hormones, structural components of cells and receptors including membrane bound receptors.

[0161] Preferably, the biological molecule is known to be involved in the cellular process of interest.

[0162] In one embodiment of the invention, the biological molecule is responsive to a signal, which may be an externally applied signal such as an environmental signal, for example redox stress, or the binding of an extracellular ligand to a cell surface receptor leading to a cellular response mediated by a signal transduction pathway. Alternatively, the signal may be an internally applied signal such as an increase in kinase activity due to falling levels of a cell metabolite.

[0163] The levels of the biological molecule may be altered directly or indirectly. Direct alteration may be achieved by, for example, causing cells to take up the molecule by incubating cells in a medium containing higher than physiological levels of the molecule. Other methods include vesicle-mediated delivery and microinjection. In the case of nucleic acids and polypeptides, the level of the biological molecule in the cell may be raised by the introduction of a heterologous nucleic acid into the cell which directs the expression of the nucleic acid or polypeptide.

[0164] The term “heterologous nucleic acid” in the present context means that the nucleic acid is not present in its natural context, i.e. the cell has been modified so as to contain the nucleic acid which would otherwise not be present in the form in which it is introduced. For example, the nucleic acid may be extrachromosomal. The nucleic acid may also be integrated into the genome by viral transduction or homologous recombination. Nonetheless, part or all of the heterologous nucleic acid may be identical to a corresponding genomic sequence since the introduction of additional copies of a gene is a convenient means for increasing the levels of expression of that gene.

[0165] Indirect means for altering the levels of the biological molecule are numerous and include increasing the levels of an inhibitory or stimulatory molecule using the methods described above. Inhibitory molecules include antisense nucleic acids, ribozyme or an EGS directed against the mRNA encoding the biological molecule, a transdominant negative mutant directed against the biological molecule, transcription factors, enzyme inhibitors, and intracellular antibodies such as scFvs. Stimulatory molecules include enzyme activators, transcriptional activators. Thus cells may be manipulated in a number of ways such that ultimately the levels of the biological molecule are altered. Reduced expression may be achieved by expressing an anti-sense RNA,

[0166] The levels of the biological molecule are altered relative to physiological levels. Thus they may be enhanced or reduced. The term “relative to physiological levels” means relative to the concentration or activity of the biological molecule typically present in the cell under normal physiological conditions prior to manipulation of those levels. Thus the intention is that by deliberate means, the activity of the biological molecule is altered above or below that which is found in the cell under a range of normal physiological conditions. “Physiological conditions” includes the conditions normally found in vivo and the conditions normally used in vitro to culture the cells.

[0167] By way of an example, the activity or concentration may be increased or decreased 5-fold, 10-fold, 20-fold, 50-fold or 100-fold compared to the normal physiological activity or concentration found in the cell prior to introducing, for example, the heterologous nucleic acid.

[0168] Where, as in a preferred embodiment of the invention, the levels of the biological molecule are altered by the introduction of a heterologous nucleic acid, typically a nucleic acid that directs expression of a polypeptide, the heterologous nucleic acid will comprise a coding sequence operably linked to a control sequence that is capable of providing for the expression of the coding sequence by the host cell, i.e. the vector is an expression vector. The term “operably linked” means that the components described are in a relationship permitting them to function in their intended manner. A regulatory sequence “operably linked” to a coding sequence is ligated in such a way that expression of the coding sequence is achieved under conditions compatible with the control sequences.

[0169] The control sequences may be modified, for example by the addition of further transcriptional regulatory elements to make the level of transcription directed by the control sequences more responsive to transcriptional modulators.

[0170] Control sequences operably linked to sequences encoding the protein of the invention include promoters/enhancers and other expression regulation signals. These control sequences may be selected to be compatible with the host cell in which the expression vector is designed to be used. The term promoter is well known in the art and encompasses nucleic acid regions ranging in size and complexity from minimal promoters to promoters including upstream elements and enhancers.

[0171] The promoter is typically selected from promoters which are functional in mammalian cells, although promoters functional in prokaryotic cells or other eukaryotic cells may be used where appropriate. Thus, the promoter is typically derived from promoter sequences of viral or eukaryotic genes. For example, it may be a promoter derived from the genome of a cell in which expression is to occur. Eukaryotic promoters may be promoters that function in a ubiquitous manner (such as promoters of α-actin, β-actin, tubulin) or, alternatively, a tissue-specific manner (such as promoters of the genes for pyruvate kinase). Tissue-specific promoters specific for particular cells may be used. They may also be promoters that respond to specific stimuli, for example promoters that bind steroid hormone receptors. Viral promoters may also be used, for example the Moloney murine leukaemia virus long terminal repeat (MMLV LTR) promoter, the Rous sarcoma virus (RSV) LTR promoter or the human cytomegalovirus (CMV) IE promoter.

[0172] It may be advantageous for the promoters to be inducible so that the levels of expression from the heterologous nucleic acid can be regulated during the life-time of the cell. Inducible means that the levels of expression obtained using the promoter can be regulated.

[0173] In addition, any of these promoters may be modified by the addition of further regulatory sequences, for example enhancer sequences. Chimeric promoters may also be used comprising sequence elements from two or more different promoters described above.

[0174] Suitable vectors include plasmids, artificial chromosomes and viral vectors. Viral vectors include DNA virus vectors, RNA virus vectors (ie. retroviral vectors), such as lentiviruses, adenoviral vectors, adeno-associated vectors and herpes simplex viral vectors. Vectors/polynucleotides may be introduced into suitable host cells using a variety of techniques known in the art, such as transfection, transformation, electroporation, infection with recombinant viral vectors such as retroviruses, herpes simplex viruses and adenoviruses, direct injection of nucleic acids and biolistic transformation. It is particularly preferred to use recombinant viral vector-mediated techniques.

[0175] A cell of interest can be any cell, for example a prokaryotic cell, a yeast cell, a plant cell or an animal cell, such as an insect cell or a mammalian cell, including a human cell. In the case of cells from a multicellular organism, cells may be primary cells or immortalised cell lines. Although cells are frequently referred to in the singular, in general cells will be part of a cell population.

[0176] In certain aspects of the invention, a comparison is required between gene expression in at least two distinct cells. Typically the first of the two or more cells is termed a reference cell. In a preferred embodiment, the cells to be used in the comparison are substantially identical in all respects. For example they may both be cells of the same cell line or obtained from the same tissue in an organism. One or both of the cells may then be manipulated so that they comprise altered levels, relative to physiological levels, of the biological molecule as described above. In one embodiment, the first cell is unaltered and the second cell altered. This is particularly preferred since it should result in an improved signal to noise ratio. In a highly preferred embodiment, the first cell is unaltered, and the second cell comprises RARβ2 according to the present invention. Preferably, the cells are mammalian neuronal cells.

[0177] Nonetheless, it is not necessary that the cells used as the starting point of the investigation be substantially identical. For example, in one aspect of the invention, genes involved in disease processes may be investigated using cells from a diseased organism, such as a mammalian patient. These may be compared with cells from a normal organism or similar cells from the same or a different diseased individual. Where cells from a normal organism and a diseased organism are used, generally the normal cells correspond to the first cell of interest and the diseased cells correspond to the second cell of interest. Consequently, at least the diseased cells are modified as described above so that they comprise altered levels of the biological molecule.

[0178] In another embodiment, one cell is a cell comprising a mutant gene whereas the other cell comprises a wild-type version of the same gene.

[0179] Another possibility is that the cells are from different tissues or from different stages in development or differentiation, for example as affected by the presence or absence of RARβ2, and/or retinoic acid or derivatives thereof.

[0180] The present invention provides a number of improved methods for identifying genes by differential expression screening techniques.

[0181] In another aspect, a method is provided for identifying genes involved in a cellular process. Essentially one of the cells is manipulated so that the levels within that cell of a biological molecule involved in the cellular process are altered. Preferably, this process is neurite outgrowth and/or neural regeneration as effected by the action of retinoic acid through RARβ2. Typically, this is achieved by the introduction of a heterologous nucleic acid into the cell to direct the expression of a polypeptide such as RARβ2. The polypeptide may be the same as the biological molecule or it may modulate the levels of the biological molecule as described above.

[0182] In general, simply modulating the levels of a biological molecule in one of two identical cells and then measuring gene transcription is not the aim of the methods of the present invention, since this would measure the effect of the biological molecule on gene expression in the cells rather than use the change in the levels of the biological molecule to enhance or reduce the response to an event of interest.

[0183] However, where the biological molecule is a gene product, such as a polypeptide, that is produced naturally within the cell, altering the levels of the gene product by the introduction of a heterologous nucleic acid may be used to simultaneously both perturb a cellular process and enhance the response to such a perturbation, making it easier to identify gene products involved in that cellular process using differential expression techniques. By way of an example, overexpression of HIF-1α not only induces a hypoxic response but amplifies the downstream elements of that response due to an enhanced regulatory effect on HIF-1α-mediated transcription.

[0184] Nonetheless in the broader aspects of the present invention, two main possibilities arise. Firstly, the two cells are different and have inherently different gene expression patterns. In this situation, alterations in the levels of the biological molecule can be used to enhance those differences. The two cells may be, for example, from different tissues, or from different stages in development or differentiation. The two cells may also be different by virtue of one cell being from diseased tissue and the other cell from normal tissue. Other configurations envisaged are given above.

[0185] Secondly, the two cells are the same but one of the cells is stimulated in some manner and the other cell not (or one is stimulated to a greater extent than the other). For example, one cell is incubated in the presence of a growth factor and the other is not. The growth factor is therefore not the biological molecule but is instead a stimulus designed to perturb gene expression in the cell, the effects of which may be amplified by the biological molecule which in turn is altered in level by the polypeptide expressed from the heterologous nucleic acid.

[0186] Thus in a second aspect there is provided a method whereby genes whose expression is regulated by a signal are identified by subjecting two distinct cell populations to different levels of a signal, whereby either or both cells have been manipulated so as to alter the levels of a biological molecule whose activity is responsive to the signal, and identifying gene products whose expression differs. The term “whose activity is responsive to the signal” includes biological molecules whose concentration in the cell varies in response to the signal as well as biological molecules whose properties such as enzymatic activity or affinity for another cellular component vary in response to the signal.

[0187] Thus, returning to our growth factor example, the cells that are exposed to the growth factor may have been altered to express increased levels of a transcription factor involved in the signal transduction cascade. Consequently, the effect of the growth factor will be increased downstream of the transcription factor (in either a negative or positive sense), making it easier to identify differentially expressed genes whose expression is regulated by the transcription factor and ultimately by the growth factor. Preferably the growth factor leads to stimulation of neural regeneration/neurite outgrowth via signalling through RARβ2.

[0188] The signal may be either physical, such as redox conditions, CO₂ levels, light or temperature, or chemical such as ligands that bind to receptors on the cell surface and trigger signal transduction pathways (including hormones or cell surface molecules normally attached to other cells), or substrates for enzyme reactions that diffuse into or are transported into the cell.

[0189] The first cell is subjected to the signal at a first level and the second cell is subjected to the signal at a second level. The first level may simply be the absence of the signal and the second level may be the presence of the signal, or vice-versa. The levels of the signals may be adjusted so as to provide a discernible difference in gene expression but are preferably at physiologically relevant levels.
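By way of illustration only, the comparison of gene expression between a first cell and a second cell exposed to different levels of a signal can be sketched as follows. This is a minimal sketch and not the method of the invention: the gene names, the expression values, the pseudocount and the two-fold threshold are all hypothetical assumptions introduced for the example.

```python
import math


def differential_genes(first_cell, second_cell, min_log2_fold=1.0, pseudocount=1.0):
    """Return {gene: log2 fold change} for genes whose expression differs.

    first_cell / second_cell map gene names to expression values
    (arbitrary units). The threshold and pseudocount are assumptions.
    """
    changed = {}
    for gene in sorted(set(first_cell) | set(second_cell)):
        # The pseudocount avoids division by zero for unexpressed genes.
        a = first_cell.get(gene, 0.0) + pseudocount
        b = second_cell.get(gene, 0.0) + pseudocount
        log2_fold = math.log2(b / a)
        if abs(log2_fold) >= min_log2_fold:
            changed[gene] = log2_fold
    return changed


# Hypothetical measurements: first cell without the signal,
# second cell with the signal.
no_signal = {"geneA": 10.0, "geneB": 50.0, "geneC": 7.0}
with_signal = {"geneA": 43.0, "geneB": 52.0, "geneC": 1.0}

hits = differential_genes(no_signal, with_signal)
for gene, change in hits.items():
    direction = "induced" if change > 0 else "repressed"
    print(f"{gene}: log2 fold change {change:+.2f} ({direction})")
```

A positive value indicates a gene product whose expression is higher in the second cell, a negative value one whose expression is lower.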

[0190] In another aspect of the present invention, knowledge already acquired about genes involved in a disease or other biological process may be used to generate further information about other genes whose expression is altered in a disease or other biological process. To do this, one cell is modified so that the levels of the gene product known to be involved in the disease or other biological process are altered, either directly by the introduction of a heterologous nucleic acid encoding the gene product, or indirectly as described above. Gene expression is then measured in both cells and the results compared to identify gene products whose expression varies.

[0191] In this aspect of the invention, the two cells may be identical, except for the change in the levels of the gene product known to be involved in the disease or other biological process of interest. The two cells may thus both be normal cells of the same type as a cell type in which the disease or other process manifests itself, or they may both be diseased cells. Alternatively, one cell may be normal and the other diseased. Preferably the diseased cell is the modified cell if only one of the cells is modified.

[0192] In another aspect of the invention, differential expression screening methods are used to identify genes involved in a disease or other process in a two stage procedure. Firstly, gene expression is compared between a first cell of interest, for example a cell from a normal patient, and a second cell of interest, for example corresponding cells from a diseased patient. As discussed above, the first cell and the second cell will be different in some aspect such that they have different expression patterns. This may be because the cells are from different tissues or different individuals (for example a normal patient and a diseased patient) or the cells may be of similar origin but have been treated differently in some respect.

[0193] Gene products whose expression differs between the first cell and the second cell are identified. Secondly, a third cell of interest, essentially identical to the first cell, is used in a screening procedure where a candidate gene is introduced into the third cell so that levels of the gene are altered (typically raised). Gene expression in this cell is compared with gene expression in the first cell, and gene products whose expression differs between the first cell and the third cell comprising altered levels of the candidate gene are identified. If a gene product whose expression is altered in the second cell also has altered gene expression in the third cell, then the candidate gene is selected for further study. Preferably there is a correlation over two or more gene products, preferably at least four or five gene products, to minimise false positives.
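The two stage procedure described above can be sketched, purely as an illustration, as a comparison of two sets of gene products. The function names, the gene product labels and the default threshold of four shared gene products are assumptions made for the example. The sketch only checks that a gene product is altered in both comparisons and does not consider the direction of the change.

```python
def shared_gene_products(altered_in_second_cell, altered_in_third_cell):
    """Gene products altered in both the second cell and the third cell."""
    return set(altered_in_second_cell) & set(altered_in_third_cell)


def select_for_further_study(altered_in_second_cell, altered_in_third_cell, min_shared=4):
    """Select the candidate gene if enough altered gene products coincide.

    min_shared=4 is an assumed threshold reflecting the stated preference
    for a correlation over at least four or five gene products.
    """
    shared = shared_gene_products(altered_in_second_cell, altered_in_third_cell)
    return len(shared) >= min_shared


# Stage 1 (hypothetical): gene products altered in the second cell
# relative to the first cell.
second_vs_first = {"g1", "g2", "g3", "g4", "g5"}

# Stage 2 (hypothetical): gene products altered in a third cell carrying
# candidate gene X or candidate gene Y, relative to the first cell.
third_with_x = {"g1", "g2", "g4", "g5", "g9"}
third_with_y = {"g2", "g9"}

print(select_for_further_study(second_vs_first, third_with_x))
print(select_for_further_study(second_vs_first, third_with_y))
```

Raising `min_shared` makes the selection more stringent at the cost of rejecting candidates that affect only part of the expression pattern.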

[0194] Clearly, the methods of the present invention may advantageously be applied to the differential analysis of non-dividing neuronal cells and a different sample of the same cells which have been induced to regenerate or undergo neurite outgrowth by the methods of the present invention. This differential analysis applies to the discovery and/or validation of candidate molecules, in particular those biological molecules which lie in the signalling pathway between the activation of the RARβ2 receptor and the actual morphological phenotype of neurite outgrowth. This phenomenon of neurite outgrowth/regeneration will be brought about by physiological changes within the cell which are initiated by the activation of RARβ2, and may include changes in gene expression. Thus, by taking a sample of neuronal cells and introducing RARβ2 as described herein, and allowing retinoic acid to signal through this receptor, and comparing the pattern of gene expression with a sample of such cells which do not contain RARβ2/retinoic acid, key difference(s) in gene expression may be identified. The pathway(s) leading to neurite outgrowth will be switched on in the cells with RARβ2/retinoic acid. By making cDNA from these and from the non-activated cells in parallel, subtractive cDNA libraries may be made in order to isolate differences in gene expression between the two sets of cells. This or other differential screening technique(s), or proteomic techniques such as 2-D electrophoretic mapping, can be used to detect the stimulation and/or repression of particular gene(s) or sets of genes which the different conditions produce. These differentially expressed genes and/or their gene products are each individual candidate factors in the stimulation of neurite outgrowth, and it will be clearly understood that the invention relates also to these. This topic is discussed in more detail below.

[0195] Host Cells

[0196] Polynucleotides for use in the present invention—such as for use as targets or for expressing targets or for use as the pharmaceutically active agent—may be introduced into host cells.

[0197] The term “host cell” in relation to the present invention includes any cell that could comprise the polynucleotide sequence of the present invention.

[0198] Here, polynucleotides may be introduced into prokaryotic cells or eukaryotic cells, for example yeast, insect or mammalian cells.

[0199] Polynucleotides of the invention may be introduced into suitable host cells using a variety of techniques known in the art, such as transfection, transformation and electroporation. Where polynucleotides of the invention are to be administered to animals, several techniques are known in the art, for example infection with recombinant viral vectors such as retroviruses, herpes simplex viruses, adenoviruses, adeno-associated viruses, direct injection of nucleic acids and biolistic transformation. The selection of the particular technique for the administration of polynucleotides into particular host cell(s) is well within the abilities of a person skilled in the art and is further discussed herein. For example, a person wishing to administer polynucleotide to a non-dividing mammalian cell such as a neuronal cell would select a vector system capable of transfecting/transducing non-dividing mammalian cells. An example of such a vector is a viral vector such as a vector based on or derived from EIAV. This and further examples are discussed at length herein.

[0200] Thus, a further embodiment of the present invention provides host cells transformed or transfected with a polynucleotide that is or expresses the target of the present invention. Preferably said polynucleotide is carried in a vector for the replication and expression of polynucleotides that are to be the target or are to express the target. The cells will be chosen to be compatible with the said vector and may for example be prokaryotic (for example bacterial), fungal, yeast or plant cells.

[0201] The gram negative bacterium E. coli is widely used as a host for heterologous gene expression. However, large amounts of heterologous protein tend to accumulate inside the cell. Subsequent purification of the desired protein from the bulk of E. coli intracellular proteins can sometimes be difficult.

[0202] In contrast to E. coli, bacteria from the genus Bacillus are very suitable as heterologous hosts because of their capability to secrete proteins into the culture medium. Other bacteria suitable as hosts are those from the genera Streptomyces and Pseudomonas.

[0203] Depending on the nature of the polynucleotide encoding the polypeptide of the present invention, and/or the desirability for further processing of the expressed protein, eukaryotic hosts such as yeasts or other fungi may be preferred. In general, yeast cells are preferred over fungal cells because they are easier to manipulate. However, some proteins are either poorly secreted from the yeast cell, or in some cases are not processed properly (e.g. hyperglycosylation in yeast). In these instances, a different fungal host organism should be selected.

[0204] Examples of suitable expression hosts within the scope of the present invention are fungi such as Aspergillus species (such as those described in EP-A-0184438 and EP-A-0284603) and Trichoderma species; bacteria such as Bacillus species (such as those described in EP-A-0134048 and EP-A-0253455), Streptomyces species and Pseudomonas species; and yeasts such as Kluyveromyces species (such as those described in EP-A-0096430 and EP-A-0301670) and Saccharomyces species. By way of example, typical expression hosts may be selected from Aspergillus niger, Aspergillus niger var. tubingensis, Aspergillus niger var. awamori, Aspergillus aculeatus, Aspergillus nidulans, Aspergillus oryzae, Trichoderma reesei, Bacillus subtilis, Bacillus licheniformis, Bacillus amyloliquefaciens, Kluyveromyces lactis and Saccharomyces cerevisiae.

[0205] Polypeptides that are extensively modified may require correct processing to complete their function. In those instances, mammalian cell expression systems (such as HEK-293, CHO, HeLa) are required, and the polypeptides are expressed either intracellularly, on the cell membranes, or secreted in the culture media if preceded by an appropriate leader sequence.

[0206] The use of suitable host cells—such as yeast, fungal, plant and mammalian host cells—may provide for post-translational modifications (e.g. myristoylation, glycosylation, truncation, lipidation and tyrosine, serine or threonine phosphorylation) as may be needed to confer optimal biological activity on recombinant expression products of the present invention.

[0207] Organism

[0208] The term “organism” in relation to the present invention includes any organism that could comprise the sequence according to the present invention and/or products obtained therefrom. Examples of organisms may include a fungus, yeast or a plant.

[0209] The term “transgenic organism” in relation to the present invention includes any organism that comprises the target according to the present invention and/or products obtained therefrom.

[0210] Transformation of Host Cells/Host Organisms

[0211] As indicated earlier, the host organism can be a prokaryotic or a eukaryotic organism. Examples of suitable prokaryotic hosts include E. coli and Bacillus subtilis. Teachings on the transformation of prokaryotic hosts are well documented in the art, for example see Sambrook et al (Molecular Cloning: A Laboratory Manual, 2nd edition, 1989, Cold Spring Harbor Laboratory Press) and Ausubel et al., Current Protocols in Molecular Biology (1995), John Wiley & Sons, Inc.

[0212] If a prokaryotic host is used then the nucleotide sequence may need to be suitably modified before transformation—such as by removal of introns.

[0213] In another embodiment the transgenic organism can be a yeast. In this regard, yeast have also been widely used as a vehicle for heterologous gene expression. The species Saccharomyces cerevisiae has a long history of industrial use, including its use for heterologous gene expression. Expression of heterologous genes in Saccharomyces cerevisiae has been reviewed by Goodey et al (1987, Yeast Biotechnology, D R Berry et al, eds, pp 401-429, Allen and Unwin, London) and by King et al (1989, Molecular and Cell Biology of Yeasts, E F Walton and G T Yarranton, eds, pp 107-133, Blackie, Glasgow).

[0214] For several reasons Saccharomyces cerevisiae is well suited for heterologous gene expression. First, it is non-pathogenic to humans and it is incapable of producing certain endotoxins. Second, it has a long history of safe use following centuries of commercial exploitation for various purposes. This has led to wide public acceptability. Third, the extensive commercial use and research devoted to the organism has resulted in a wealth of knowledge about the genetics and physiology as well as large-scale fermentation characteristics of Saccharomyces cerevisiae.

[0215] A review of the principles of heterologous gene expression in Saccharomyces cerevisiae and secretion of gene products is given by E Hinchcliffe and E Kenny (1993, “Yeast as a vehicle for the expression of heterologous genes”, Yeasts, Vol 5, Anthony H Rose and J Stuart Harrison, eds, 2nd edition, Academic Press Ltd.).

[0216] Several types of yeast vectors are available, including integrative vectors, which require recombination with the host genome for their maintenance, and autonomously replicating plasmid vectors.

[0217] In order to prepare the transgenic Saccharomyces, expression constructs are prepared by inserting the nucleotide sequence of the present invention into a construct designed for expression in yeast. Several types of constructs used for heterologous expression have been developed. The constructs contain a promoter active in yeast fused to the nucleotide sequence of the present invention; usually a promoter of yeast origin, such as the GAL1 promoter, is used. Usually a signal sequence of yeast origin, such as the sequence encoding the SUC2 signal peptide, is used. A terminator active in yeast ends the expression system.

[0218] For the transformation of yeast several transformation protocols have been developed. For example, a transgenic Saccharomyces according to the present invention can be prepared by following the teachings of Hinnen et al (1978, Proceedings of the National Academy of Sciences of the USA 75, 1929); Beggs, J D (1978, Nature, London, 275, 104); and Ito, H et al (1983, J Bacteriology 153, 163-168).

[0219] The transformed yeast cells are selected using various selective markers. Among the markers used for transformation are a number of auxotrophic markers such as LEU2, HIS4 and TRP1, and dominant antibiotic resistance markers such as aminoglycoside antibiotic markers, e.g. G418.

[0220] Another host organism is a plant. The basic principle in the construction of genetically modified plants is to insert genetic information in the plant genome so as to obtain a stable maintenance of the inserted genetic material. Several techniques exist for inserting the genetic information, the two main principles being direct introduction of the genetic information and introduction of the genetic information by use of a vector system. A review of the general techniques may be found in articles by Potrykus (Annu Rev Plant Physiol Plant Mol Biol [1991] 42:205-225) and Christou (Agro-Food-Industry Hi-Tech March/April 1994 17-27). Further teachings on plant transformation may be found in EP-A-0449375.

[0221] Further hosts suitable for the nucleotide sequence of the present invention include higher eukaryotic cells, such as insect cells or vertebrate cells, particularly mammalian cells, including human cells, or nucleated cells from other multicellular organisms. In recent years propagation of vertebrate cells in culture (tissue culture) has become a routine procedure. Examples of useful mammalian host cell lines are epithelial or fibroblastic cell lines such as Chinese hamster ovary (CHO) cells, NIH 3T3 cells, HeLa cells or 293T cells.

[0222] The nucleotide sequence of the present invention may be stably incorporated into host cells or may be transiently expressed using methods known in the art. By way of example, stably transfected mammalian cells may be prepared by transfecting cells with an expression vector having a selectable marker gene, and growing the transfected cells under conditions selective for cells expressing the marker gene. To prepare transient transfectants, mammalian cells are transfected with a reporter gene to monitor transfection efficiency.

[0223] To produce such stably or transiently transfected cells, the cells should be transfected with a sufficient amount of the nucleotide sequence of the present invention. The precise amounts of the nucleotide sequence of the present invention may be empirically determined and optimised for a particular cell and assay.

[0224] Thus, the present invention also provides a method of transforming a host cell with a nucleotide sequence that is to be the target or is to express the target. Host cells transformed with the nucleotide sequence may be cultured under conditions suitable for the expression of the encoded protein. The protein produced by a recombinant cell may be displayed on the surface of the cell. If desired, and as will be understood by those of skill in the art, expression vectors containing coding sequences can be designed with signal sequences which direct secretion of the coding sequences through a particular prokaryotic or eukaryotic cell membrane. Other recombinant constructions may join the coding sequence to nucleotide sequence encoding a polypeptide domain which will facilitate purification of soluble proteins (Kroll D J et al (1993) DNA Cell Biol 12:441-53).

[0225] Receptors

[0226] The RARβ2 receptor as discussed herein includes mimetics, homologues, fragments and part or all of the entire gene product. Preferably the RARβ2 receptor as discussed herein refers to substantially the entire gene product.

[0227] In one embodiment, the present invention relates to the use of a receptor in the production of neurite outgrowth. Previously, attempts have been made to produce neurite outgrowth using a number of different techniques. Typically, nerve growth factor (NGF) is used to stimulate neurite outgrowth. However, NGF is a relatively large molecule with a correspondingly high molecular weight, and is susceptible to protease mediated degradation. NGF is also relatively expensive to prepare. Similar approaches to the stimulation of neurite outgrowth have also encountered various difficulties. Moreover, such approaches have centred on the use of stimulatory factors such as growth factors in order to produce such desired phenotype(s). However, it is surprisingly shown herein that the long-felt need for the production of neurite outgrowth, for example in non-dividing cells, may be achieved using the converse approach disclosed herein, i.e. the use of receptors to stimulate neurite outgrowth as described and demonstrated in the present invention. This disclosure runs against current thinking in the art, which has been focussed on the use of growth factors to try to elicit neurite outgrowth from non-dividing cells such as terminally differentiated neuronal cells. The surprising finding that receptor(s) may be delivered to such cells to produce neural regeneration/neurite outgrowth is illustrated herein by using RARβ2 as an example of this general approach. Thus, the present invention relates to the use of a receptor in the production of neurite outgrowth. The receptor may be any eukaryotic receptor, preferably a vertebrate receptor, more preferably a mammalian receptor, more preferably a primate receptor, most preferably a human receptor. Receptors for use in the present invention may comprise one or more membrane-spanning domain(s). In a preferred embodiment, receptors useful in the present invention are human receptors, without regard to their natural temporal and/or spatial expression profile. In a highly preferred embodiment, receptors useful in the present invention are human receptors which are not normally expressed in cell(s) of the adult target tissue. In a most highly preferred embodiment, receptors useful in the present invention are retinoic acid receptors such as RARs, such as in particular RARβ2. Receptor(s) useful in the present invention are preferably delivered to the target cell(s) using a vector system as described herein, such as a lentiviral vector system.

[0228] Neurological Disorders

[0229] Clearly, stimulation of neurite outgrowth according to the present invention will have therapeutic benefit in a number of pathologies. These include, but are not limited to, neurological disorders, for example degenerative neurological disorders such as Parkinson's disease, Alzheimer's disease, or related conditions, or neural injury such as spinal cord injury or other such physical condition.

[0230] The term neurological disorders as used herein may refer to any injury, whether mechanically (for example by trauma) or chemically induced (for example by neurotoxin(s), or by a regime of treatment having an immunosuppressant effect, whether by design, or as a side-effect), any neural pathology such as caused by viral infection or otherwise, any degenerative disorder, or other nerve tissue related disorder.

[0231] Examples of neurological disorders include conditions such as Parkinson's disease, Alzheimer's disease, senility, motor neurone disease, schizophrenia as well as other neural and/or neurodegenerative disorders. Other neural related disorders may include glaucoma or other cause of damage to the optic nerve, Bell's palsy or other forms of localised paralysis, neurally based impotence such as caused by nerve trauma following radical prostatectomy, or other complaints. Other disorders in which the invention may be useful include neuropathological effects of diabetes, AIDS neuropathy, leprosy etc.

[0232] The term neurological disorder refers to any disorder of a nervous system, whether the peripheral nervous system or the central nervous system (CNS), whether the sympathetic nervous system, or the parasympathetic nervous system, or whether affecting a subset or superset of different nerve types.

[0233] Nucleotide of Interest (NOI)

[0234] In accordance with the present invention, the NOI sequence may encode a peptide which peptide may be the pharmaceutically active agent—such as an RA receptor, preferably RARβ2, or an agonist thereof.

[0235] Such coding NOI sequences may be typically operatively linked to a suitable promoter capable of driving expression of the peptide, such as in one or more specific cell types.

[0236] In addition to the NOI or part thereof and the expression regulatory elements described herein, the delivery system may contain additional genetic elements for the efficient or regulated expression of the gene or genes, including promoters/enhancers, translation initiation signals, internal ribosome entry sites (IRES), splicing and polyadenylation signals.

[0237] The NOI or NOIs may be under the expression control of an expression regulatory element, usually a promoter or a promoter and enhancer. The enhancer and/or promoter may be preferentially active in neural cells, such that the NOI is preferentially expressed in the particular cells of interest, such as in nerve cells. Thus any significant biological effect or deleterious effect of the NOI on the individual being treated may be reduced or eliminated. The enhancer element or other elements conferring regulated expression may be present in multiple copies. Likewise, or in addition, the enhancer and/or promoter may be preferentially active in one or more specific cell types—such as neural cells for example post-mitotically terminally differentiated non-replicating cells such as neurons.

[0238] The term “promoter” is used in the normal sense of the art, e.g. an RNA polymerase binding site in the Jacob-Monod theory of gene expression.

[0239] The term “enhancer” includes a DNA sequence which binds to other protein components of the transcription initiation complex and thus facilitates the initiation of transcription directed by its associated promoter.

[0240] Expression Vector

[0241] Preferably, the NOI (e.g. that encoding RARβ2 or part thereof) used in the method of the present invention is inserted into a vector which is operably linked to a control sequence that is capable of providing for the expression of the coding sequence by the host cell, i.e. the vector is an expression vector.

[0242] Codon Optimisation

[0243] As used herein, the terms “codon optimised” and “codon optimisation” refer to an improvement in codon usage. By way of example, alterations to the coding sequences for viral components may improve the sequences for codon usage in the mammalian cells or other cells which are to act as the producer cells for retroviral vector particle production. This is referred to as “codon optimisation”. Many viruses, including HIV and other lentiviruses, use a large number of rare codons and by changing these to correspond to commonly used mammalian codons, increased expression of the packaging components in mammalian producer cells can be achieved. Codon usage tables are known in the art for mammalian cells, as well as for a variety of other organisms.
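The principle of codon optimisation can be illustrated with the following minimal sketch, which replaces each codon with a synonymous codon from a preference table while leaving the encoded polypeptide unchanged. The genetic code table is the standard one. The preference table `PREFERRED` and the input sequence are hypothetical placeholders chosen for the example; they are not taken from a real codon usage table and are not the sequences of the invention.

```python
from itertools import product

# Standard genetic code, built in the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Hypothetical preference table (amino acid -> codon assumed to be commonly
# used in the producer cell). A real application would derive this from a
# published codon usage table for the relevant organism.
PREFERRED = {"L": "CTG", "R": "AGA", "K": "AAG", "E": "GAG", "I": "ATC", "G": "GGC"}


def translate(seq):
    """Translate a coding sequence into one-letter amino acid codes."""
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq), 3))


def codon_optimise(seq, preferred=PREFERRED):
    """Swap each codon for the preferred synonymous codon, where one is listed."""
    if len(seq) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of three")
    optimised = []
    for i in range(0, len(seq), 3):
        codon = seq[i:i + 3]
        amino_acid = CODON_TABLE[codon]
        # Codons for amino acids absent from the table are left unchanged.
        optimised.append(preferred.get(amino_acid, codon))
    return "".join(optimised)


original = "ATGTTAAGGAAAGAAATAGGTTAA"  # hypothetical coding sequence
optimised = codon_optimise(original)
print(original, translate(original))
print(optimised, translate(optimised))
```

Because only synonymous substitutions are made, the nucleotide sequence changes but the translated polypeptide is identical before and after optimisation.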

[0244] Preferably a high titre lentiviral vector is produced using a codon optimised gag and a codon optimised pol or a codon optimised env (see seq. listing and/or WO99/41397).

[0245] Preferably a high titre retroviral vector is produced using a modified and/or extended packaging signal.

[0246] Packaging Signal

[0247] As used herein, the term “packaging signal” or “packaging sequence” refers to sequences located within the retroviral genome which are required for insertion of the viral RNA into the viral capsid or particle. Several retroviral vectors use the minimal packaging signal (also referred to as the psi sequence) needed for encapsidation of the viral genome. By way of example, this minimal packaging signal encompasses bases 212 to 563 of the Mo-MLV genome (Mann et al 1983: Cell 33: 153). As used herein, the term “extended packaging signal” or “extended packaging sequence” refers to the use of sequences around the psi sequence with further extension into the gag gene. The inclusion of these additional packaging sequences may increase the efficiency of insertion of vector RNA into viral particles.

[0248] Preferably a high titre lentiviral vector is produced using a modified packaging signal.

[0249] Preferably the lentiviral construct is based on an EIAV vector genome where all the accessory genes are removed except Rev.

[0250] Accessory Genes

[0251] As used herein, the term “accessory genes” refers to a variety of virally encoded accessory proteins capable of modulating various aspects of retroviral replication and infectivity. These proteins are discussed in Coffin et al (ibid) (Chapters 6 and 7). Examples of accessory proteins in lentiviral vectors include but are not limited to tat, rev, nef, vpr, vpu, vif, vpx. An example of a lentiviral vector useful in the present invention is one which has all of the accessory genes removed except rev.

[0252] Transcriptional Control

[0253] The control of proviral transcription remains largely with the noncoding sequences of the viral LTR. The site of transcription initiation is at the boundary between U3 and R in the left hand side LTR and the site of poly (A) addition (termination) is at the boundary between R and U5 in the right hand side LTR. The 3′U3 sequence contains most of the transcriptional control elements of the provirus, which include the promoter and multiple enhancer sequences responsive to cellular and in some cases, viral transcriptional activator proteins.

[0254] An LTR present, for example, in a construct of the present invention and as a 3′LTR in the provirus of, for example, a target cell of the invention may be a native LTR or a heterologous regulatable LTR. It may also be a transcriptionally quiescent LTR for use in SIN vector technology.

[0255] The term “regulated LTR” also includes an inactive LTR such that the resulting provirus in the target cell cannot produce a packageable viral genome (self-inactivating (SIN) vector technology).

[0256] Preferably the regulated retroviral vector of the present invention is a self-inactivating (SIN) vector.

[0257] Self-Inactivating (SIN) Vector

[0258] By way of example, self-inactivating retroviral vectors have been constructed by deleting the transcriptional enhancers or the enhancers and promoter in the U3 region of the 3′ LTR. After a round of vector reverse transcription and integration, these changes are copied into both the 5′ and the 3′ LTRs producing a transcriptionally inactive provirus (Yu et al 1986 Proc Natl Acad Sci 83: 3194-3198; Dougherty and Temin 1987 Proc Natl Acad Sci 84: 1197-1201; Hawley et al 1987 Proc Natl Acad Sci 84: 2406-2410; Yee et al 1987 Proc Natl Acad Sci 91: 9564-9568). However, any promoter(s) internal to the LTRs in such vectors will still be transcriptionally active. This strategy has been employed to eliminate effects of the enhancers and promoters in the viral LTRs on transcription from internally placed genes. Such effects include increased transcription (Jolly et al 1983 Nucleic Acids Res 11: 1855-1872) or suppression of transcription (Emerman and Temin 1984 Cell 39: 449-467). This strategy can also be used to eliminate downstream transcription from the 3′ LTR into genomic DNA (Herman and Coffin 1987 Science 236: 845-848). This is of particular concern in human gene therapy where it is of critical importance to prevent the adventitious activation of an endogenous oncogene.
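
By way of illustration only, the copying of a 3′ U3 deletion into both proviral LTRs may be sketched as a toy model in Python. The sketch assumes, as set out above, that the vector RNA runs from the U3/R boundary of the 5′ LTR to the R/U5 boundary of the 3′ LTR, so that the only U3 sequence carried in the RNA is that of the 3′ LTR. The names LTR, transcribe_genome and reverse_transcribe, and the string labels used for the regions, are hypothetical and are introduced here purely for illustration; they do not describe any particular construct.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LTR:
    """Toy long terminal repeat made of three labelled regions."""
    u3: str
    r: str
    u5: str


def transcribe_genome(five_ltr, internal, three_ltr):
    """Model the vector RNA: R and U5 come from the 5' LTR, U3 from the 3' LTR."""
    return {"r": five_ltr.r, "u5": five_ltr.u5,
            "internal": internal, "u3": three_ltr.u3}


def reverse_transcribe(rna):
    """Model the provirus: both LTRs are rebuilt from the U3, R and U5 in the RNA."""
    ltr = LTR(u3=rna["u3"], r=rna["r"], u5=rna["u5"])
    return ltr, rna["internal"], ltr


# Hypothetical SIN construct: intact 5' LTR, enhancer/promoter deleted from the 3' U3.
intact_5ltr = LTR(u3="enhancer+promoter", r="R", u5="U5")
deleted_3ltr = LTR(u3="", r="R", u5="U5")

rna = transcribe_genome(intact_5ltr, "internal-promoter::NOI", deleted_3ltr)
provirus_5ltr, provirus_internal, provirus_3ltr = reverse_transcribe(rna)

print(provirus_5ltr, provirus_internal, provirus_3ltr)
```

In this model both proviral LTRs carry the deleted U3, while the internal promoter and NOI are unchanged, which corresponds to the statement that internal promoters remain transcriptionally active.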

[0259] Targeted Vector

[0260] The term “targeted vector” refers to a vector whose ability to infect/transfect/transduce a cell or to be expressed in a host and/or target cell is restricted to certain cell types within the host organism, usually cells having a common or similar phenotype.

[0261] Preferably the targeted vector has a pseudotyped envelope gene in order to effectively transduce a specific cell type.

[0262] Envelope (ENV)

[0263] If the retroviral component includes an env nucleotide sequence, then all or part of that sequence can be optionally replaced with all or part of another env nucleotide sequence such as, by way of example, the amphotropic Env protein designated 4070A or the influenza haemagglutinin (HA) or the vesicular stomatitis virus G (VSV-G) protein. Replacement of the env gene with a heterologous env gene is an example of a technique or strategy called pseudotyping. Examples of pseudotyping may be found in WO-A-98/05759, WO-A-98/05754, WO-A-97/17457, WO-A-96/09400, WO-A-91/00047 and Mebatsion et al 1997 Cell 90, 841-847.

[0264] In one preferred aspect, the retroviral vector of the present invention has been pseudotyped. In this regard, pseudotyping can confer one or more advantages. For example, with the lentiviral vectors, the env gene product of the HIV based vectors would restrict these vectors to infecting only cells that express a protein called CD4. But if the env gene in these vectors has been substituted with env sequences from other RNA viruses, then they may have a broader infectious spectrum (Verma and Somia 1997 Nature 389:239-242). By way of example, workers have pseudotyped an HIV based vector with the glycoprotein from VSV (Verma and Somia 1997 ibid).

[0265] In another alternative, the Env protein may be a modified Env protein such as a mutant or engineered Env protein. Modifications may be made or selected to introduce targeting ability or to reduce toxicity or for another purpose (Valsesia-Wittman et al 1996 J Virol 70: 2056-64; Nilson et al 1996 Gene Therapy 3: 280-6; Fielding et al 1998 Blood 9: 1802 and references cited therein).

[0266] The term “retroviral vector particle” refers to the packaged retroviral vector, that is preferably capable of binding to and entering target cells. The components of the particle, as already discussed for the vector, may be modified with respect to the wild type retrovirus. For example, the Env proteins in the proteinaceous coat of the particle may be genetically modified in order to alter their targeting specificity or achieve some other desired function.

[0267] Preferably, the viral vector preferentially transduces a certain cell type or cell types.

[0268] More preferably, the viral vector is a targeted vector, that is it has a tissue tropism which is altered compared to the native virus, so that the vector is targeted to particular cells.

[0269] For retroviral vectors, this may be achieved by modifying the Env protein. The Env protein of the retroviral secondary vector needs to be a non-toxic envelope or an envelope which may be produced in non-toxic amounts within the primary target cell, such as for example a MMLV amphotropic envelope or a modified amphotropic envelope. The safety feature in such a case is preferably the deletion of regions of sequence homology between retroviral components.

[0270] Preferably the envelope is one which allows transduction of human cells. Examples of suitable env genes include, but are not limited to, VSV-G, a MLV amphotropic env such as the 4070A env, the RD114 feline leukaemia virus env or haemagglutinin (HA) from an influenza virus. The Env protein may be one which is capable of binding to a receptor on a limited number of human cell types and may be an engineered envelope containing targeting moieties. The env and gag-pol coding sequences are transcribed from a promoter and optionally an enhancer active in the chosen packaging cell line and the transcription unit is terminated by a polyadenylation signal. For example, if the packaging cell is a human cell, a suitable promoter-enhancer combination is that from the human cytomegalovirus major immediate early (hCMV-MIE) gene and a polyadenylation signal from SV40 virus may be used. Other suitable promoters and polyadenylation signals are known in the art.

[0271] The packaging cell may be an in vivo packaging cell in the body of an individual to be treated or it may be a cell cultured in vitro such as a tissue culture cell line. Suitable cell lines include mammalian cells such as murine fibroblast derived cell lines or human cell lines. Preferably the packaging cell line is a human cell line, such as for example: 293 cell line, HEK293, 293-T, TE671, HT1080.

[0272] Alternatively, the packaging cell may be a cell derived from the individual to be treated such as a monocyte, macrophage, stem cells, blood cell or fibroblast. The cell may be isolated from an individual and the packaging and vector components administered ex vivo followed by re-administration of the autologous packaging cells. Alternatively the packaging and vector components may be administered to the packaging cell in vivo. Methods for introducing retroviral packaging and vector components into cells of an individual are known in the art. For example, one approach is to introduce the different DNA sequences that are required to produce a retroviral vector particle e.g. the env coding sequence, the gag-pol coding sequence and the defective retroviral genome into the cell simultaneously by transient triple transfection (Landau & Littman 1992 J. Virol. 66, 5110; Soneoka et al 1995 Nucleic Acids Res 23:628-633).

[0273] In one embodiment the vector configurations of the present invention use as their production system, three transcription units expressing a genome, the gag-pol components and an envelope. The envelope expression cassette may include one of a number of envelopes such as VSV-G or various murine retrovirus envelopes such as 4070A.

[0274] Conventionally these three cassettes would be expressed from three plasmids transiently transfected into an appropriate cell line such as 293T or from integrated copies in a stable producer cell line. An alternative approach is to use another virus as an expression system for the three cassettes, for example baculovirus or adenovirus. These are both nuclear expression systems. To date the use of a poxvirus to express all of the components of a retroviral or lentiviral vector system has not been described. In particular, given the unusual codon usage of lentiviruses and their requirement for RNA handling systems such as the rev/RRE system it has not been clear whether incorporation of all three cassettes and their subsequent expression in a vector that expresses in the cytoplasm rather than the nucleus is feasible. Until now the possibility remained that key nuclear factors and nuclear RNA handling pathways would be required for expression of the vector components and their function in the gene delivery vehicle. Here we describe such a system and show that lentiviral components can be made in the cytoplasm and that they assemble into functional gene delivery systems. The advantage of this system is the ease with which poxviruses can be handled, the high expression levels and the ability to retain introns in the vector genomes.

[0275] According to another aspect therefore there is provided a hybrid viral vector system for in vivo gene delivery, which system comprises a primary viral vector which is obtainable from or is based on a poxvirus and a second viral vector which is obtainable from or is based on a retroviral vector, preferably a lentiviral vector, even more preferably a non-primate lentiviral vector, even more preferably an EIAV.

[0276] The secondary vector may be produced from expression of essential genes for retroviral vector production encoded in the DNA of the primary vector. Such genes may include a gag-pol from a retrovirus, an env gene from an enveloped virus and a defective retroviral vector containing one or more therapeutic or diagnostic NOI(s). The defective retroviral vector contains in general terms sequences to enable reverse transcription, at least part of a 5′ long terminal repeat (LTR), at least part of a 3′LTR and a packaging signal.

[0277] If it is desired to render the secondary vector replication defective, that secondary vector may be encoded by a plurality of transcription units, which may be located in a single or in two or more adenoviral or other primary vectors.

[0278] In some therapeutic or experimental situations it may be desirable to obviate the need to make EIAV derived from MVA in vitro. MVA-EIAV hybrids are delivered directly into the patient/animal e.g. MVA-EIAV is injected intravenously into the tail vein of a mouse and this recombinant virus infects a variety of murine tissues e.g. lung, spleen etc. Infected cells express transduction competent EIAV containing a therapeutic gene for gene therapy for example. EIAV vector particles bud from these cells and transduce neighbouring cells. The transduced cell then contains an integrated copy of the EIAV vector genome and expresses the therapeutic gene product or other gene product of interest. If expression of the therapeutic gene product is potentially toxic to the host it may be regulated by a specific promoter, e.g. the hypoxic response element (HRE), which will restrict expression to those cells in a hypoxic environment. For gene therapy of lung/trachea epithelium cells, e.g. to treat cystic fibrosis, MVA-EIAV may be given as an aerosol delivered intranasally. Alternatively, macrophages can be transduced in vitro and then reintroduced to create macrophage factories for EIAV-based vectors. Furthermore, because MVA is replication incompetent, MVA-EIAV hybrids could also be used to treat immuno-suppressed hosts.

[0279] Vaccinia virus, the prototypic member of the orthopox genus within the family poxviridae, was the first virus used for expression of recombinant exogenous proteins (Mackett et al 1982, Paoletti & Panicali 1982). Vaccinia virus has a large DNA genome of greater than 180 kb and reports indicate that it can accommodate over 25 kb of foreign DNA (Merchlinsky & Moss 1992). Several other strains of poxviruses have been adapted as recombinant expression vectors (for review see Carroll and Moss 1997) e.g. fowlpox (Taylor & Paoletti 1988), canarypox (Taylor et al 1991), swinepox (van der Leek et al 1994) and entomopox (Li et al 1997). Additionally, due to safety concerns, several highly attenuated strains of vaccinia virus have been developed that are compromised in human and other mammalian cells e.g. modified vaccinia virus Ankara (MVA) (Mayr 1978, Sutter 1992), NYVAC (Paoletti et al 1994), vaccinia virus deficient in a DNA replication enzyme (Holzer et al 1997). These may all be used in the present invention.

[0280] MVA was derived from a replication competent vaccinia smallpox vaccine strain, Ankara. After >500 passages in chick embryo fibroblast cells the virus isolate was shown to be highly attenuated in a number of animal models including mice that were immune deficient (Mayr et al 1978). The attenuated isolate, MVA, was used to vaccinate over 120,000 people, many of whom were immunocompromised (Mahnel 1994), without adverse effects. Studies illustrate that MVA can infect a wide range of mammalian cells but productive infection has only been observed in hamster kidney BHK-21 cells (Carroll 1997). In all other tested mammalian cell lines early expression, DNA replication and late expression are observed, leading to the production of non-infectious immature virus particles (Carroll 1997, Meyer 1991). Virus replication studies show that a minority of mammalian cells can support very low level production of infectious virus i.e. BS-C-1 cells in which 1 infectious MVA particle is produced per cell (Carroll and Moss 1997). Late gene expression usually gives rise to >10-fold more protein than expression of genes under early promoters (Chakrabarti et al 1997, Wyatt et al 1996). In all other attenuated poxvirus strains late gene expression is rarely observed in mammalian cells.

[0281] Production of retrovirus vector systems e.g. MLV-HIV and lentivirus vector systems requires the construction of producer lines that express the virus genome and essential structural proteins to make transduction competent virus. Generally, this is a relatively inefficient process which is further complicated when the virus is pseudotyped with toxic envelope proteins such as VSV-G. Expression of a functional genome and the required structural proteins from within a recombinant poxvirus may obviate many of the current inefficient retrovirus and lentivirus vector production technologies. Additionally, such recombinant poxviruses may be directly injected into patients to give rise to in vivo production of retrovirus or lentivirus.

[0282] MVA is a particularly suitable poxvirus for the construction of a pox-retrovirus or pox-lentivirus hybrid due to its non-replicating phenotype and its ability to perform both early and strong late expression for the production of high titre vector preparations.

[0283] Replication Vectors

[0284] The nucleotide sequences encoding the RARβ2 of the present invention may be incorporated into a recombinant replicable vector. The vector may be used to replicate the nucleotide sequence in a compatible host cell. Thus in one embodiment of the present invention, the invention provides a method of making the RARβ2 of the present invention by introducing a nucleotide sequence of the present invention into a replicable vector, introducing the vector into a compatible host cell, and growing the host cell under conditions which bring about replication of the vector. The vector may be recovered from the host cell.

[0285] Host/Target Cells

[0286] Host and/or target cells comprising nucleotide sequences of the present invention may be used to express the RARβ2 of the present invention under in vitro, in vivo and ex vivo conditions.

[0287] The term “host cell” and/or “target cell” includes any cell derivable from a suitable organism which a vector is capable of transfecting or transducing. Examples of host and/or target cells can include but are not limited to cells capable of expressing the RARβ2 of the present invention under in vitro, in vivo and ex vivo conditions. Examples of such cells include but are not limited to neuronal cells, nerve cells, post-mitotically terminally differentiated non-replicating cells such as neurons or combinations thereof.

[0288] In a preferred embodiment, the cell is a mammalian cell.

[0289] In a highly preferred embodiment, the cell is a human cell.

[0290] The term “organism” includes any suitable organism. In a preferred embodiment, the organism is a mammal. In a highly preferred embodiment, the organism is a human.

[0291] The present invention also provides a method comprising transforming a host and/or target cell with a or the nucleotide sequence(s) of the present invention.

[0292] The term “transformed cell” means a host cell and/or a target cell having a modified genetic structure. With the present invention, a cell has a modified genetic structure when a vector according to the present invention has been introduced into the cell.

[0293] Regulation of Expression in vitro/vivo/ex vivo

[0294] The present invention also encompasses gene therapy whereby the RARβ2 encoding nucleotide sequence(s) of the present invention is regulated in vitro/in vivo/ex vivo. For example, expression regulation may be accomplished by administering compounds that bind to the RARβ2 encoding nucleotide sequence(s) of the present invention, or control regions associated with the RARβ2 encoding nucleotide sequence of the present invention, or its corresponding RNA transcript to modify the rate of transcription or translation.

[0295] Control Sequences

[0296] Control sequences operably linked to sequences encoding the RARβ2 of the present invention include promoters/enhancers and other expression regulation signals. These control sequences may be selected to be compatible with the host cell and/or target cell in which the expression vector is designed to be used. The control sequences may be modified, for example by the addition of further transcriptional regulatory elements to make the level of transcription directed by the control sequences more responsive to transcriptional modulators.

[0297] Operably Linked

[0298] The term “operably linked” means that the components described are in a relationship permitting them to function in their intended manner. A regulatory sequence “operably linked” to a coding sequence is ligated in such a way that expression of the coding sequence is achieved under conditions compatible with the control sequences.

[0299] Preferably the nucleotide sequence of the present invention is operably linked to a transcription unit.

[0300] The term “transcription unit(s)” as used herein refers to regions of nucleic acid containing coding sequences and the signals for achieving expression of those coding sequences independently of any other coding sequences. Thus, each transcription unit generally comprises at least a promoter, an optional enhancer and a polyadenylation signal.

[0301] Promoters

[0302] The term promoter is well-known in the art and is used in the normal sense of the art, e.g. an RNA polymerase binding site. The term encompasses nucleic acid regions ranging in size and complexity from minimal promoters to promoters including upstream elements and enhancers.

[0303] The promoter is typically selected from promoters which are functional in mammalian cells, although prokaryotic promoters and promoters functional in other eukaryotic cells may be used. The promoter is typically derived from promoter sequences of viral or eukaryotic genes. For example, it may be a promoter derived from the genome of a cell in which expression is to occur. With respect to eukaryotic promoters, they may be promoters that function in a ubiquitous manner (such as promoters of α-actin, β-actin, tubulin) or, alternatively, a tissue-specific manner (such as promoters of the genes for pyruvate kinase).

[0304] Preferably the promoter is a constitutive promoter such as CMV.

[0305] Preferably the promoters of the present invention are tissue specific.

[0306] Tissue-Specific Promoters

[0307] The promoters of the present invention may be tissue-specific promoters. Examples of suitable tissue restricted promoters/enhancers are those which are highly active in tumour cells such as a promoter/enhancer from a MUC1 gene, a CEA gene or a 5T4 antigen gene. Examples of temporally restricted promoters/enhancers are those which are responsive to ischaemia and/or hypoxia, such as hypoxia response elements or the promoter/enhancer of a grp78 or a grp94 gene. The alpha fetoprotein (AFP) promoter is also a tumour-specific promoter. One preferred promoter-enhancer combination is a human cytomegalovirus (hCMV) major immediate early (MIE) promoter/enhancer combination.

[0308] Preferably the promoters of the present invention are tissue specific. That is, they are capable of driving transcription of a RARβ2 encoding nucleotide sequence(s) in one tissue while remaining largely “silent” in other tissue types.

[0309] The term “tissue specific” means a promoter which is not restricted in activity to a single tissue type but which nevertheless shows selectivity in that it may be active in one group of tissues and less active or silent in another group.

[0310] The level of expression of a or the RARβ2 encoding nucleotide sequence(s) under the control of a particular promoter may be modulated by manipulating the promoter region. For example, different domains within a promoter region may possess different gene regulatory activities. The roles of these different regions are typically assessed using vector constructs having different variants of the promoter with specific regions deleted (that is, deletion analysis). This approach may be used to identify, for example, the smallest region capable of conferring tissue specificity.
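
By way of illustration only, the logic of a deletion analysis may be sketched in Python as follows. The sketch generates successive 5′ truncations of a promoter sequence and returns the shortest truncation that still scores as tissue specific. The sequences, the element named HYPOTHETICAL_ELEMENT and the function toy_assay are invented for illustration; toy_assay merely stands in for an experimental reporter assay, which cannot be replaced by a string search.

```python
def five_prime_deletions(promoter, step):
    """Return successive 5' truncations of the promoter, longest first."""
    return [promoter[i:] for i in range(0, len(promoter), step)]


def smallest_specific_fragment(promoter, step, is_tissue_specific):
    """Return the shortest truncation that the supplied assay scores as specific."""
    specific = [frag for frag in five_prime_deletions(promoter, step)
                if is_tissue_specific(frag)]
    return min(specific, key=len) if specific else None


# Hypothetical data: an invented regulatory element within an invented promoter.
HYPOTHETICAL_ELEMENT = "GGCCTTAA"
toy_promoter = "AAAAAAAA" + HYPOTHETICAL_ELEMENT + "TATAAA"


def toy_assay(fragment):
    """Stand-in for a reporter assay: specific only if the element is intact."""
    return HYPOTHETICAL_ELEMENT in fragment


minimal = smallest_specific_fragment(toy_promoter, 2, toy_assay)
print(minimal)
```

The resolution of such an analysis is limited by the truncation step; a smaller step maps the boundary of the region more finely at the cost of more constructs.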

[0311] A number of tissue specific promoters, described above, may be particularly advantageous in practising the present invention. In most instances, these promoters may be isolated as convenient restriction digestion fragments suitable for cloning in a selected vector. Alternatively, promoter fragments may be isolated using the polymerase chain reaction. Cloning of the amplified fragments may be facilitated by incorporating restriction sites at the 5′ end of the primers. Preferably, a tissue-specific promoter used herein is specific for neuronal cells.
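
By way of illustration only, the incorporation of restriction sites at the 5′ end of PCR primers may be sketched in Python as follows. The template sequence and the function design_primers are hypothetical and are introduced purely for illustration. The EcoRI (GAATTC) and BamHI (GGATCC) recognition sequences are used as examples; in practice a few additional bases are often placed 5′ of the site to assist digestion, and the chosen sites should be absent from the fragment to be amplified.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def design_primers(template, anneal_len, fwd_site, rev_site):
    """Build forward and reverse primers, each with a restriction site at its 5' end."""
    forward = fwd_site + template[:anneal_len]
    reverse = rev_site + reverse_complement(template[-anneal_len:])
    return forward, reverse


ECORI = "GAATTC"  # EcoRI recognition sequence
BAMHI = "GGATCC"  # BamHI recognition sequence

# Hypothetical promoter fragment, invented for this sketch.
template = "ATGCCGTTAGCATTGACCGTAAGCTTGCAATCG"

# The chosen sites should not occur within the fragment to be cloned.
assert ECORI not in template and BAMHI not in template

forward_primer, reverse_primer = design_primers(template, 10, ECORI, BAMHI)
print(forward_primer)
print(reverse_primer)
```

Using a different site on each primer, as in this sketch, allows the amplified fragment to be inserted into the vector in a defined orientation.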

[0312] Inducible Promoters

[0313] The promoters of the present invention may also be promoters that respond to specific stimuli, for example promoters that bind steroid hormone receptors. Viral promoters may also be used, for example the Moloney murine leukaemia virus long terminal repeat (MMLV LTR) promoter, the rous sarcoma virus (RSV) LTR promoter or the human cytomegalovirus (CMV) IE promoter.

[0314] It may also be advantageous for the promoters to be inducible so that the levels of expression of the heterologous gene can be regulated during the life-time of the cell. Inducible means that the levels of expression obtained using the promoter can be regulated.

[0315] Enhancer

[0316] In addition, any of these promoters may be modified by the addition of further regulatory sequences, for example enhancer sequences. Chimeric promoters may also be used comprising sequence elements from two or more different promoters described above.

[0317] The term “enhancer” includes a DNA sequence which binds to other protein components of the transcription initiation complex and thus facilitates the initiation of transcription directed by its associated promoter.

[0318] The in vitro/in vivo/ex vivo expression of the RARβ2 of the present invention may be used in combination with a protein of interest (POI) or a nucleotide sequence of interest (NOI) encoding same.

[0319] POIs and NOIs

[0320] Suitable proteins of interest (POIs) or NOIs encoding same for use in the present invention include those that are of therapeutic and/or diagnostic application such as, but are not limited to: sequences encoding cytokines, chemokines, hormones, antibodies, engineered immunoglobulin-like molecules, a single chain antibody, fusion proteins, enzymes, immune co-stimulatory molecules, immunomodulatory molecules, anti-sense RNA, a transdominant negative mutant of a target protein, a toxin, a conditional toxin, an antigen, a tumour suppressor protein and growth factors, membrane proteins, vasoactive proteins and peptides, anti-viral proteins and ribozymes, and derivatives thereof (such as with an associated reporter group). When included, the POIs or NOIs encoding same may be typically operatively linked to a suitable promoter, which may be a promoter driving expression of a ribozyme(s), or a different promoter or promoters, such as in one or more specific cell types.

[0321] Cytokines

[0322] In one aspect of the present invention the NOI(s) encodes a POI(s) wherein the POI is a cytokine or a cytokine receptor.

[0323] As used herein, the term “cytokines” refers to any varied group of proteins that are released from mammalian cells and act on other cells through specific receptors, said term also including said receptors. The term “cytokine” is often used interchangeably with the term “mediator”. Cytokines may elicit from the target cell a variety of responses depending on the cytokine and the target cell. By way of example, cytokines may be important in signalling between cells as inflammatory reactions develop. In the initial stages, cytokines such as IL-1 and IL-6 may be released from cells of the tissue where the inflammatory reaction is occurring. Once lymphocytes and mononuclear cells have started to enter the inflammatory site, they may become activated by antigen and release cytokines of their own such as IL-1, TNF, IL-4 and IFNγ which further enhance cellular migration by their actions on the local endothelium. Other cytokines, such as IL-8, are chemotactic or can activate incoming cells. The term “cytokine” includes but is not limited to factors such as cardiotrophin, EGF, FGF-acidic, FGF-basic, flt3 Ligand, G-CSF, GM-CSF, IFN-γ, IGF-I, IGF-III, IL-1α, IL-1β, IL-2, IL-3, IL-4, IL-5, IL-6, IL-7, IL-8, IL-9, IL-10, IL-11, IL-12, IL-13, IL-15, IL-16, IL-17, IL-18 (IGIF), KGF, LIF, M-CSF, Oncostatin M, PDGF-A, PDGF-AB, PDGF-BB, SCF, SCGF, TGF-α, TGF-β₁, TNF-α, TNF-β, TPO and VEGF, as well as their cognate receptors.

[0324] Coupling

[0325] The RARβ2 of the present invention can be coupled to other molecules using standard methods. The amino and carboxyl termini of RARβ2 may be isotopically and nonisotopically labeled with many techniques, for example radiolabeling using conventional techniques (tyrosine residues: chloramine T, iodogen, lactoperoxidase; lysine residues: Bolton-Hunter reagent). These coupling techniques are well known to those skilled in the art. The coupling technique is chosen on the basis of the functional groups available on the amino acids including, but not limited to amino, sulfhydryl, carboxyl, amide, phenol, and imidazole. Various reagents used to effect these couplings include among others, glutaraldehyde, diazotized benzidine, carbodiimide, and p-benzoquinone.

[0326] Chemical Coupling

[0327] The RARβ2 of the present invention may be chemically coupled to isotopes, enzymes, carrier proteins, cytotoxic agents, fluorescent molecules, radioactive nucleotides and other compounds for a variety of applications including but not limited to imaging/prognosis, diagnosis and/or therapy.

[0328] Imaging

[0329] The use of labelled RARβ2 of the present invention with short lived isotopes enables visualization and quantitation of RARβ2 binding sites in vivo using autoradiographic, or modern radiographic or other membrane binding techniques such as positron emission tomography in order to locate tumours with RARβ2 binding sites. This application provides important diagnostic and/or prognostic research tools.

[0330] Conjugates

[0331] In other embodiments, the RARβ2 of the invention is coupled to a scintigraphic radiolabel, a cytotoxic compound or radioisotope, an RARβ2 for converting a non-toxic prodrug into a cytotoxic drug, a compound for activating the immune system in order to target the resulting conjugate to a disease site such as a colon tumour, or a cell-stimulating compound. Such conjugates have a “binding portion”, which consists of the RARβ2 of the invention, and a “functional portion”, which consists of the radiolabel,

[0332] Individual

[0333] As used herein, the term “individual” refers to vertebrates, particularly members of the mammalian species, more in particular, humans.

[0334] Treatment

[0335] It is to be appreciated that all references herein to treatment include curative, palliative and prophylactic treatment.

[0336] Dosage

[0337] The dosage of the RARβ2 and/or pharmaceutical composition of the present invention will depend on the disease state or condition being treated and other clinical factors such as weight and condition of the individual and the route of administration of the compound. Depending upon the half-life of the RARβ2 in the particular individual, the RARβ2 and/or pharmaceutical composition can be administered between several times per day to once a week. It is to be understood that the present invention has application for both human and veterinary use. The methods of the present invention contemplate single as well as multiple administrations, given either simultaneously or over an extended period of time.

[0338] Typically, a physician will determine the actual dosage which will be most suitable for an individual subject and it will vary with the age, weight and response of the particular patient and severity of the condition. The dosages below are exemplary of the average case. There can, of course, be individual instances where higher or lower dosage ranges are merited.

[0339] In addition or in the alternative the compositions (or component parts thereof) of the present invention may be administered by direct injection. In addition or in the alternative the compositions (or component parts thereof) of the present invention may be administered topically. In addition or in the alternative the compositions (or component parts thereof) of the present invention may be administered by inhalation. In addition or in the alternative the compositions (or component parts thereof) of the present invention may also be administered by one or more of: a mucosal route, for example, as a nasal spray or aerosol for inhalation or as an ingestible solution such as by an oral route, or by a parenteral route where delivery is by an injectable form, such as, for example, by a rectal, ophthalmic (including intravitreal or intracameral), nasal, topical (including buccal and sublingual), intrauterine, vaginal or parenteral (including subcutaneous, intraperitoneal, intramuscular, intravenous, intradermal, intracranial, intratracheal, and epidural) transdermal, intraperitoneal, intracranial, intracerebroventricular, intracerebral, intravaginal, intrauterine, or parenteral (e.g., intravenous, intraspinal, intracavernosal, subcutaneous, transdermal or intramuscular) route.

[0340] By way of further example, the pharmaceutical composition of the present invention may be administered in accordance with a regimen of 1 to 10 times per day, such as once or twice per day. The specific dose level and frequency of dosage for any particular patient may be varied and will depend upon a variety of factors including the activity of the specific compound employed, the metabolic stability and length of action of that compound, the age, body weight, general health, sex, diet, mode and time of administration, rate of excretion, drug combination, the severity of the particular condition, and the individual undergoing therapy.

[0341] Disorders

[0342] The present invention is believed to have a wide therapeutic applicability.

[0343] For example, the present invention may be useful in the treatment of the disorders listed in WO-A-98/05635. For ease of reference, part of that list is now provided: cancer, inflammation or inflammatory disease, dermatological disorders, fever, cardiovascular effects, haemorrhage, coagulation and acute phase response, cachexia, anorexia, acute infection, HIV infection, shock states, graft-versus-host reactions, autoimmune disease, reperfusion injury, meningitis, migraine and aspirin-dependent anti-thrombosis; tumour growth, invasion and spread, angiogenesis, metastases, malignant ascites and malignant pleural effusion; cerebral ischaemia, ischaemic heart disease, osteoarthritis, rheumatoid arthritis, osteoporosis, asthma, multiple sclerosis, neurodegeneration, Alzheimer's disease, atherosclerosis, stroke, vasculitis, Crohn's disease and ulcerative colitis; periodontitis, gingivitis; psoriasis, atopic dermatitis, chronic ulcers, epidermolysis bullosa; corneal ulceration, retinopathy and surgical wound healing; rhinitis, allergic conjunctivitis, eczema, anaphylaxis; restenosis, congestive heart failure, endometriosis, atherosclerosis or endosclerosis.

[0344] In addition, or in the alternative, the present invention may be useful in the treatment of disorders listed in WO-A-98/07859. For ease of reference, part of that list is now provided: cytokine and cell proliferation/differentiation activity; immunosuppressant or immunostimulant activity (e.g. for treating immune deficiency, including infection with human immune deficiency virus; regulation of lymphocyte growth; treating cancer and many autoimmune diseases, and to prevent transplant rejection or induce tumour immunity); regulation of haematopoiesis, e.g. treatment of myeloid or lymphoid diseases; promoting growth of bone, cartilage, tendon, ligament and nerve tissue, e.g. for healing wounds, treatment of burns, ulcers and periodontal disease and neurodegeneration; inhibition or activation of follicle-stimulating hormone (modulation of fertility); chemotactic/chemokinetic activity (e.g. for mobilising specific cell types to sites of injury or infection); haemostatic and thrombolytic activity (e.g. for treating haemophilia and stroke); anti-inflammatory activity (for treating e.g. septic shock or Crohn's disease); as antimicrobials; modulators of e.g. metabolism or behaviour; as analgesics; treating specific deficiency disorders; in treatment of e.g. psoriasis, in human or veterinary medicine.

[0345] In addition, or in the alternative, the present invention may be useful in the treatment of disorders listed in WO-A-98/09985. For ease of reference, part of that list is now provided: macrophage inhibitory and/or T cell inhibitory activity and thus, anti-inflammatory activity; anti-immune activity, i.e. inhibitory effects against a cellular and/or humoral immune response, including a response not associated with inflammation; inhibit the ability of macrophages and T cells to adhere to extracellular matrix components and fibronectin, as well as up-regulated fas receptor expression in T cells; inhibit unwanted immune reaction and inflammation including arthritis, including rheumatoid arthritis, inflammation associated with hypersensitivity, allergic reactions, asthma, systemic lupus erythematosus, collagen diseases and other autoimmune diseases, inflammation associated with atherosclerosis, arteriosclerosis, atherosclerotic heart disease, reperfusion injury, cardiac arrest, myocardial infarction, vascular inflammatory disorders, respiratory distress syndrome or other cardiopulmonary diseases, inflammation associated with peptic ulcer, ulcerative colitis and other diseases of the gastrointestinal tract, hepatic fibrosis, liver cirrhosis or other hepatic diseases, thyroiditis or other glandular diseases, glomerulonephritis or other renal and urologic diseases, otitis or other oto-rhino-laryngological diseases, dermatitis or other dermal diseases, periodontal diseases or other dental diseases, orchitis or epididymo-orchitis, infertility, orchidal trauma or other immune-related testicular diseases, placental dysfunction, placental insufficiency, habitual abortion, eclampsia, pre-eclampsia and other immune and/or inflammatory-related gynaecological diseases, posterior uveitis, intermediate uveitis, anterior uveitis, conjunctivitis, chorioretinitis, uveoretinitis, optic neuritis, intraocular inflammation, e.g. retinitis or cystoid macular oedema, sympathetic ophthalmia, scleritis, retinitis pigmentosa, immune and inflammatory components of degenerative fundus disease, inflammatory components of ocular trauma, ocular inflammation caused by infection, proliferative vitreo-retinopathies, acute ischaemic optic neuropathy, excessive scarring, e.g. following glaucoma filtration operation, immune and/or inflammation reaction against ocular implants and other immune and inflammatory-related ophthalmic diseases, inflammation associated with autoimmune diseases or conditions or disorders where, both in the central nervous system (CNS) or in any other organ, immune and/or inflammation suppression would be beneficial, Parkinson's disease, complication and/or side effects from treatment of Parkinson's disease, AIDS-related dementia complex, HIV-related encephalopathy, Devic's disease, Sydenham chorea, Alzheimer's disease and other degenerative diseases, conditions or disorders of the CNS, inflammatory components of strokes, post-polio syndrome, immune and inflammatory components of psychiatric disorders, myelitis, encephalitis, subacute sclerosing pan-encephalitis, encephalomyelitis, acute neuropathy, subacute neuropathy, chronic neuropathy, Guillain-Barré syndrome, Sydenham chorea, myasthenia gravis, pseudo-tumour cerebri, Down's Syndrome, Huntington's disease, amyotrophic lateral sclerosis, inflammatory components of CNS compression or CNS trauma or infections of the CNS, inflammatory components of muscular atrophies and dystrophies, and immune and inflammatory related diseases, conditions or disorders of the central and peripheral nervous systems, post-traumatic inflammation, septic shock, infectious diseases, inflammatory complications or side effects of surgery, bone marrow transplantation or other transplantation complications and/or side effects, inflammatory and/or immune complications and side effects of gene therapy, e.g. due to infection with a viral carrier, or inflammation associated with AIDS, to suppress or inhibit a humoral and/or cellular immune response, to treat or ameliorate monocyte or leukocyte proliferative diseases, e.g. leukaemia, by reducing the amount of monocytes or lymphocytes, for the prevention and/or treatment of graft rejection in cases of transplantation of natural or artificial cells, tissue and organs such as cornea, bone marrow, organs, lenses, pacemakers, natural or artificial skin tissue.

[0346] In particular, the present invention may be useful in the treatment of neurological disorders or injuries as discussed herein.

[0347] Delivery

[0348] The delivery system for use in the present invention may be any suitable delivery system for delivering said NOI and providing said NOI is expressed in vivo to produce said associated peptide (e.g. RARβ2), which in turn provides the beneficial therapeutic effect.

[0349] The delivery system may be a viral delivery system. Viral delivery systems include but are not limited to an adenovirus vector, an adeno-associated viral (AAV) vector, a herpes viral vector, a retroviral vector, a lentiviral vector and a baculoviral vector. Alternatively, the delivery system may be a non-viral delivery system, such as, by way of example, DNA transfection methods using, for example, plasmids, chromosomes or artificial chromosomes. Here, transfection includes a process using a non-viral vector to deliver a gene to a target mammalian cell. Typical transfection methods include electroporation, DNA biolistics, lipid-mediated transfection, compacted DNA-mediated transfection, liposomes, immunoliposomes, lipofectin, cationic agent-mediated transfection, cationic facial amphiphiles (CFAs) (Nature Biotechnology 1996 14; 556), and combinations thereof.

[0350] Other examples of vectors include ex vivo delivery systems, which include but are not limited to DNA transfection methods such as electroporation, DNA biolistics, lipid-mediated transfection and compacted DNA-mediated transfection.

[0351] In a preferred aspect, the delivery system is a vector.

[0352] In a more preferred aspect, the delivery system is a viral delivery system—sometimes referred to as a viral vector.

[0353] Vectors

[0354] As is well known in the art, a vector is a tool that allows or facilitates the transfer of an entity from one environment to another. By way of example, some vectors used in recombinant DNA techniques allow entities, such as a segment of DNA (such as a heterologous DNA segment, such as a heterologous cDNA segment), to be transferred into a target cell. Optionally, once within the target cell, the vector may then serve to maintain the heterologous DNA within the cell or may act as a unit of DNA replication. Examples of vectors used in recombinant DNA techniques include plasmids, chromosomes, artificial chromosomes or viruses.

[0355] The term “vector” includes expression vectors and/or transformation vectors.

[0356] The term “expression vector” means a construct capable of in vivo or in vitro/ex vivo expression.

[0357] The term “transformation vector” means a construct capable of being transferred from one species to another.

[0358] Viral Vectors

[0359] In the present invention, the NOI may be introduced into suitable host cells using a viral delivery system (a viral vector). A variety of viral techniques are known in the art, such as for example infection with recombinant viral vectors such as DNA viruses, retroviruses, herpes simplex viruses, adenoviruses and adeno-associated viruses.

[0360] Suitable recombinant viral vectors include but are not limited to adenovirus vectors, adeno-associated viral (AAV) vectors, herpes-virus vectors, retroviral vectors, lentiviral vectors, baculoviral vectors, pox viral vectors or parvovirus vectors (see Kestler et al 1999 Human Gene Ther 10(10):1619-32). In the case of viral vectors, gene delivery is typically mediated by viral infection of a target cell.

[0361] Herpes Virus Based Vectors

[0362] Herpes simplex viruses (HSV) I and II are large linear DNA viruses of approximately 150 kb encoding 70-80 genes. Like adenoviruses, HSV can infect a wide variety of cell types, including muscle, tumours, lung, liver and pancreatic islets. The viruses are able both to infect cells lytically and to establish latency in specific cell types, such as neurons. In order to use HSV as a vector, it is rendered replication defective. Following infection of a cell with HSV, the expression of a small number of immediate early (IE) genes is induced by a viral transactivating protein, VP16, which is carried into the cell as part of the viral tegument. The IE genes, which include ICP0, 4, 6, 22 and 27, are themselves regulators of gene expression that are important for the induction of the early and late genes required for viral replication and encapsidation. Mutation of ICP4 results in a virus unable to replicate except in a complementing cell line, but which still expresses the other IE gene products; these other IE proteins are toxic to many cell types. Vectors defective for ICP4, 22 and 27 have been generated that have reduced levels of toxicity and prolonged gene expression in culture and in vivo. Herpes simplex virus can infect non-dividing cells of the mammalian nervous system.

[0363] An alternative approach to producing infectious HSV vectors is the use of amplicons. In this approach, a plasmid containing an HSV origin of replication and packaging sequence is cotransfected with cosmids containing the HSV genome but with a defective packaging sequence. The resulting virus particles contain only plasmid nucleic acid sequences, thereby eliminating any toxicity associated with low-level HSV-protein expression. This approach generates a helper-free stock of virus. HSV vectors have a large capacity for inserting heterologous DNA, allowing up to 50 kb to be included successfully, which may comprise multiple therapeutic genes. For example, four different antitumour genes have been inserted into a single HSV vector for use in cancer therapy. HSV vectors can be used to obtain highly regulated gene expression. An RU486-hormone-regulated chimeric transcription factor has been inserted into HSV along with a promoter containing binding sites for the regulated transcription factor; specific, regulated gene expression has been observed in vivo. Essentially all of the viral proteins may be deleted (gut-less vectors), still allowing around 10⁶ viral particles to be produced per ml.

[0364] Adeno-Associated Viral Vectors

[0365] Adeno-associated virus (AAV) is a member of the parvovirus family, small single-stranded DNA viruses that require a helper virus, such as adenovirus or herpes-simplex virus, for replication. AAV is a human virus, with the majority of the population being seropositive for AAV, but no pathology has been associated with it. The virus contains two genes, rep and cap, encoding polypeptides important for replication and encapsidation, respectively. The wild-type virus can be grown to high titres and is able to integrate stably into a specific region of chromosome 19 following infection. The recombinant virus may not always integrate site-specifically. It has been suggested that this integration requires the presence of the rep protein. In wild-type virus infection, second-strand synthesis is stimulated by the presence of adenovirus E1 and E4 proteins; in the absence of adenovirus coinfection, cellular factors appear to dictate the rate of second-strand synthesis. In certain cell types, and/or following treatment with DNA-damaging agents, the rate of second-strand synthesis is high.

[0366] For the production of viral vectors, these two genes can be supplied in trans with only the inverted terminal repeats (ITRs) required in cis for viral replication. Therapeutic genes with the appropriate regulatory sequences can be inserted between the two ITRs, and the viral vector generated by cotransfection into the 293 cell line with a rep and cap expression vector and subsequent infection with a first-generation adenoviral vector.

[0367] The degree of AAV infection of muscle, brain and liver cells with recombinant virus is exceedingly high in vivo. In these cell types, stable infection and gene expression apparently occur independently of the helper virus. Injection of a β-galactosidase-containing AAV vector into muscle has also resulted in β-galactosidase-positive myofibres for up to two years. Similarly, the injection of virus into the brain has resulted in long-term gene expression. AAV vectors containing human factor IX complementary DNA have been used to infect liver and muscle cells in immunocompetent mice. The mice produced therapeutic amounts of factor IX protein in their blood for over six months, confirming the utility of AAV as a viral vector. AAV is highly suitable for the delivery of genes to specific target cells in vivo, preferably without inducing an immune response to the infected cells.

[0368] Retroviral Vectors

[0369] Examples of retroviruses include but are not limited to: murine leukemia virus (MLV), human immunodeficiency virus (HIV), equine infectious anaemia virus (EIAV), mouse mammary tumour virus (MMTV), Rous sarcoma virus (RSV), Fujinami sarcoma virus (FuSV), Moloney murine leukemia virus (Mo-MLV), FBR murine osteosarcoma virus (FBR MSV), Moloney murine sarcoma virus (Mo-MSV), Abelson murine leukemia virus (A-MLV), Avian myelocytomatosis virus-29 (MC29), and Avian erythroblastosis virus (AEV). A detailed list of retroviruses may be found in Coffin et al “Retroviruses” 1997 Cold Spring Harbor Laboratory Press Eds: J M Coffin, S M Hughes, H E Varmus pp 758-763.

[0370] Preferred vectors for use in accordance with the present invention are recombinant viral vectors, in particular recombinant retroviral vectors (RRV) such as lentiviral vectors. Lentiviral vectors are able to deliver genes to non-dividing, terminally differentiated cells.

[0371] The term “recombinant retroviral vector” (RRV) refers to a vector with sufficient retroviral genetic information to allow packaging of an RNA genome, in the presence of packaging components, into a viral particle capable of infecting a target cell. Infection of the target cell includes reverse transcription and integration into the target cell genome. The RRV carries non-viral coding sequences which are to be delivered by the vector to the target cell. An RRV is incapable of independent replication to produce infectious retroviral particles within the final target cell. Usually the RRV lacks a functional gag-pol and/or env gene and/or other genes essential for replication.

[0372] Lentiviral genomes can be quite variable. For example there are many quasi-species of HIV-1 which are still functional. This is also the case for EIAV. These variants may be used to enhance particular parts of the transduction process. Examples of HIV-1 variants may be found at http://hiv-web.lanl.gov. Details of EIAV clones may be found at the NCBI database: http://www.ncbi.nlm.nih.gov.

[0373] EIAV vectors have been shown to deliver genes very efficiently to a number of neuronal cell types in vitro and in vivo. Gene expression has been sustained for a number of months in vivo, with little or no immunological reaction. Thus, according to the present invention EIAV vectors are a suitable delivery system to direct expression of RARβ2 in the human peripheral and central nervous systems and such systems are discussed in detail herein.

[0374] Vector titre may be estimated by infection assays. For example, infections could be carried out with the vector preparation in question, and antibody staining for the product of the nucleotide of interest could be used to determine the proportion of productively infected cells, giving an indication of the titre of the vector preparation. For example, antibodies directed against RARβ2 are commercially available and may be advantageously utilised for this purpose according to the manufacturers' instructions. Alternatively, a PCR approach may be used, by amplifying using primers directed at the nucleotide of interest delivered by the vector, such as a nucleotide sequence directing the expression of RARβ2. Primers may advantageously be designed to include or comprise vector sequence(s) in order to ensure that the relevant amplification product has indeed originated from the sequence in question. Other ways in which vector titre may be estimated are known in the art, and are discussed in the Examples section hereinbelow.
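
As a rough illustration of the infection-assay approach described above, the proportion of productively infected (antibody-positive) cells may be converted into a titre estimate. The sketch below assumes the commonly used relationship: titre = (fraction of positive cells × number of cells at the time of infection × dilution factor) / inoculum volume. The function name, its parameters and the example figures are hypothetical and are not taken from the present specification.

```python
def estimate_titre(positive_cells, counted_cells, cells_at_infection,
                   inoculum_volume_ml, dilution_factor=1.0):
    """Estimate vector titre in transducing units (TU) per ml.

    Assumed relationship (not taken from the specification):
        titre = fraction_positive * cells_at_infection * dilution_factor
                / inoculum_volume_ml
    """
    if counted_cells <= 0 or inoculum_volume_ml <= 0:
        raise ValueError("counted_cells and inoculum_volume_ml must be positive")
    if not 0 <= positive_cells <= counted_cells:
        raise ValueError("positive_cells must lie between 0 and counted_cells")

    # Proportion of cells staining positive for the product of the NOI.
    fraction_positive = positive_cells / counted_cells

    # Scale up to the whole well, correct for dilution, normalise per ml.
    return (fraction_positive * cells_at_infection * dilution_factor
            / inoculum_volume_ml)


if __name__ == "__main__":
    # Hypothetical figures: 50 of 500 counted cells stain positive,
    # 1e5 cells were present at infection, and 0.1 ml of a 1:100
    # dilution of the vector preparation was applied.
    titre = estimate_titre(50, 500, 1e5, 0.1, dilution_factor=100)
    print(f"Estimated titre: {titre:.2e} TU/ml")
```

An estimate of this kind is generally only meaningful at dilutions where a small proportion of the cells are positive, since at higher proportions cells infected more than once are counted only once.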

[0375] Non-Viral Delivery

[0376] The pharmaceutically active agent (e.g. the RARβ2) may be administered using non-viral techniques.

[0377] By way of example, the pharmaceutically active agent may be delivered using peptide delivery. Peptide delivery uses domains or sequences from proteins capable of translocation through the plasma and/or nuclear membrane.

[0378] Polypeptides of interest such as RARβ2 may be directly introduced to the cell by microinjection, or delivery using vesicles such as liposomes which are capable of fusing with the cell membrane. Viral fusogenic peptides may also be used to promote membrane fusion and delivery to the cytoplasm of the cell.

[0379] Preferably, the RARβ2 or fragment(s) thereof may be delivered into cells as protein fusions or conjugates with a protein capable of crossing the plasma membrane and/or the nuclear membrane. Preferably, the RARβ2 or fragment(s) thereof is fused or conjugated to a domain or sequence from such a protein responsible for the translocational activity. Preferred translocation domains and sequences include domains and sequences from the HIV-1-trans-activating protein (Tat), Drosophila Antennapedia homeodomain protein and the herpes simplex-1 virus VP22 protein.

[0380] Exogenously added HIV-1-trans-activating protein (Tat) can translocate through the plasma membrane and reach the nucleus to transactivate the viral genome. Translocational activity has been identified in amino acids 37-72 (Fawell et al., 1994, Proc. Natl. Acad. Sci. U. S. A. 91, 664-668), 37-62 (Anderson et al., 1993, Biochem. Biophys. Res. Commun. 194, 876-884) and 49-58 (having the basic sequence RKKRRQRRR) of HIV-Tat. Vives et al. (1997), J Biol Chem 272, 16010-7 identified a sequence consisting of amino acids 48-60 (CGRKKRRQRRRPPQC), which appears to be important for translocation, nuclear localisation and trans-activation of cellular genes. The third helix of the Drosophila Antennapedia homeodomain protein has also been shown to possess similar properties (reviewed in Prochiantz, A., 1999, Ann N Y Acad Sci, 886, 172-9). The domain responsible for translocation in Antennapedia has been localised to a 16 amino acid long peptide rich in basic amino acids having the sequence RQIKIWFQNRRMKWKK (Derossi, et al., 1994, J Biol Chem, 269, 10444-50). This peptide has been used to direct biologically active substances to the cytoplasm and nucleus of cells in culture (Theodore, et al., 1995, J Neurosci 15, 7158-7167). The VP22 tegument protein of herpes simplex virus is capable of intercellular transport, in which VP22 protein expressed in a subpopulation of cells spreads to other cells in the population (Elliott and O'Hare, 1997, Cell 88, 223-33). Fusion proteins consisting of GFP (Elliott and O'Hare, 1999, Gene Ther 6, 149-51), thymidine kinase protein (Dilber et al., 1999, Gene Ther 6, 12-21) or p53 (Phelan et al., 1998, Nat Biotechnol 16, 440-3) with VP22 have been targeted to cells in this manner. Any of the domains or sequences as set out above, or others identified as having translocational activity, may be used to direct the RARβ2 or fragment(s) thereof into a cell.
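
By way of illustration only, the translocation sequences quoted above may be compared by length and by content of basic residues. The short sketch below simply counts arginine (R) and lysine (K) in the three one-letter sequences given in the text. The dictionary labels, the function name and the choice to count only R and K as basic are assumptions made for this example and do not form part of the cited work.

```python
# One-letter sequences as quoted in the text; the labels are illustrative.
TRANSLOCATION_SEQUENCES = {
    "HIV-1 Tat basic sequence": "RKKRRQRRR",
    "HIV-1 Tat sequence (Vives et al.)": "CGRKKRRQRRRPPQC",
    "Antennapedia 16-residue peptide": "RQIKIWFQNRRMKWKK",
}

# Assumption for this sketch: only arginine and lysine are counted as basic.
BASIC_RESIDUES = frozenset("RK")


def basic_fraction(sequence):
    """Return the fraction of residues in `sequence` that are R or K."""
    if not sequence:
        raise ValueError("sequence must not be empty")
    basic = sum(1 for residue in sequence.upper() if residue in BASIC_RESIDUES)
    return basic / len(sequence)


if __name__ == "__main__":
    for label, seq in TRANSLOCATION_SEQUENCES.items():
        print(f"{label}: length {len(seq)}, "
              f"basic fraction {basic_fraction(seq):.2f}")
```

A simple count of this kind is consistent with the description of these peptides as rich in basic amino acids. It does not by itself predict translocational activity.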

[0381] Pharmaceutical Compositions

[0382] The present invention also provides a pharmaceutical composition comprising a therapeutically effective amount of the agent of the present invention (such as RARβ2 and/or an agonist thereof as discussed herein) and a pharmaceutically acceptable carrier, diluent or excipient (including combinations thereof).

1 73 1 9 PRT Human immunodeficiency virus type 1 1 Arg Lys Lys Arg Arg Gln Arg Arg Arg 1 5 2 15 PRT Human immunodeficiency virus type 1 2 Cys Gly Arg Lys Lys Arg Arg Gln Arg Arg Arg Pro Pro Gln Cys 1 5 10 15 3 16 PRT Drosophila sp. 3 Arg Gln Ile Lys Ile Trp Phe Gln Asn Arg Arg Met Lys Trp Lys Lys 1 5 10 15 4 10998 DNA Artificial Sequence Description of Artificial Sequence pONY8.0Z vector genome plasmid 4 agatcttgaa taataaaatg tgtgtttgtc cgaaatacgc gttttgagat ttctgtcgcc 60 gactaaattc atgtcgcgcg atagtggtgt ttatcgccga tagagatggc gatattggaa 120 aaattgatat ttgaaaatat ggcatattga aaatgtcgcc gatgtgagtt tctgtgtaac 180 tgatatcgcc atttttccaa aagtgatttt tgggcatacg cgatatctgg cgatagcgct 240 tatatcgttt acgggggatg gcgatagacg actttggtga cttgggcgat tctgtgtgtc 300 gcaaatatcg cagtttcgat ataggtgaca gacgatatga ggctatatcg ccgatagagg 360 cgacatcaag ctggcacatg gccaatgcat atcgatctat acattgaatc aatattggcc 420 attagccata ttattcattg gttatatagc ataaatcaat attggctatt ggccattgca 480 tacgttgtat ccatatcgta atatgtacat ttatattggc tcatgtccaa cattaccgcc 540 atgttgacat tgattattga ctagttatta atagtaatca attacggggt cattagttca 600 tagcccatat atggagttcc gcgttacata acttacggta aatggcccgc ctggctgacc 660 gcccaacgac ccccgcccat tgacgtcaat aatgacgtat gttcccatag taacgccaat 720 agggactttc cattgacgtc aatgggtgga gtatttacgg taaactgccc acttggcagt 780 acatcaagtg tatcatatgc caagtccgcc ccctattgac gtcaatgacg gtaaatggcc 840 cgcctggcat tatgcccagt acatgacctt acgggacttt cctacttggc agtacatcta 900 cgtattagtc atcgctatta ccatggtgat gcggttttgg cagtacacca atgggcgtgg 960 atagcggttt gactcacggg gatttccaag tctccacccc attgacgtca atgggagttt 1020 gttttggcac caaaatcaac gggactttcc aaaatgtcgt aacaactgcg atcgcccgcc 1080 ccgttgacgc aaatgggcgg taggcgtgta cggtgggagg tctatataag cagagctcgt 1140 ttagtgaacc gggcactcag attctgcggt ctgagtccct tctctgctgg gctgaaaagg 1200 cctttgtaat aaatataatt ctctactcag tccctgtctc tagtttgtct gttcgagatc 1260 ctacagttgg cgcccgaaca gggacctgag aggggcgcag accctacctg ttgaacctgg 1320 ctgatcgtag gatccccggg acagcagagg agaacttaca 
gaagtcttct ggaggtgttc 1380 ctggccagaa cacaggagga caggtaagat tgggagaccc tttgacattg gagcaaggcg 1440 ctcaagaagt tagagaaggt gacggtacaa gggtctcaga aattaactac tggtaactgt 1500 aattgggcgc taagtctagt agacttattt catgatacca actttgtaaa agaaaaggac 1560 tggcagctga gggatgtcat tccattgctg gaagatgtaa ctcagacgct gtcaggacaa 1620 gaaagagagg cctttgaaag aacatggtgg gcaatttctg ctgtaaagat gggcctccag 1680 attaataatg tagtagatgg aaaggcatca ttccagctcc taagagcgaa atatgaaaag 1740 aagactgcta ataaaaagca gtctgagccc tctgaagaat atctctagaa ctagtggatc 1800 ccccgggctg caggagtggg gaggcacgat ggccgctttg gtcgaggcgg atccggccat 1860 tagccatatt attcattggt tatatagcat aaatcaatat tggctattgg ccattgcata 1920 cgttgtatcc atatcataat atgtacattt atattggctc atgtccaaca ttaccgccat 1980 gttgacattg attattgact agttattaat agtaatcaat tacggggtca ttagttcata 2040 gcccatatat ggagttccgc gttacataac ttacggtaaa tggcccgcct ggctgaccgc 2100 ccaacgaccc ccgcccattg acgtcaataa tgacgtatgt tcccatagta acgccaatag 2160 ggactttcca ttgacgtcaa tgggtggagt atttacggta aactgcccac ttggcagtac 2220 atcaagtgta tcatatgcca agtacgcccc ctattgacgt caatgacggt aaatggcccg 2280 cctggcatta tgcccagtac atgaccttat gggactttcc tacttggcag tacatctacg 2340 tattagtcat cgctattacc atggtgatgc ggttttggca gtacatcaat gggcgtggat 2400 agcggtttga ctcacgggga tttccaagtc tccaccccat tgacgtcaat gggagtttgt 2460 tttggcacca aaatcaacgg gactttccaa aatgtcgtaa caactccgcc ccattgacgc 2520 aaatgggcgg taggcatgta cggtgggagg tctatataag cagagctcgt ttagtgaacc 2580 gtcagatcgc ctggagacgc catccacgct gttttgacct ccatagaaga caccgggacc 2640 gatccagcct ccgcggcccc aagcttcagc tgctcgagga tctgcggatc cggggaattc 2700 cccagtctca ggatccacca tgggggatcc cgtcgtttta caacgtcgtg actgggaaaa 2760 ccctggcgtt acccaactta atcgccttgc agcacatccc cctttcgcca gctggcgtaa 2820 tagcgaagag gcccgcaccg atcgcccttc ccaacagttg cgcagcctga atggcgaatg 2880 gcgctttgcc tggtttccgg caccagaagc ggtgccggaa agctggctgg agtgcgatct 2940 tcctgaggcc gatactgtcg tcgtcccctc aaactggcag atgcacggtt acgatgcgcc 3000 catctacacc aacgtaacct atcccattac ggtcaatccg ccgtttgttc 
ccacggagaa 3060 tccgacgggt tgttactcgc tcacatttaa tgttgatgaa agctggctac aggaaggcca 3120 gacgcgaatt atttttgatg gcgttaactc ggcgtttcat ctgtggtgca acgggcgctg 3180 ggtcggttac ggccaggaca gtcgtttgcc gtctgaattt gacctgagcg catttttacg 3240 cgccggagaa aaccgcctcg cggtgatggt gctgcgttgg agtgacggca gttatctgga 3300 agatcaggat atgtggcgga tgagcggcat tttccgtgac gtctcgttgc tgcataaacc 3360 gactacacaa atcagcgatt tccatgttgc cactcgcttt aatgatgatt tcagccgcgc 3420 tgtactggag gctgaagttc agatgtgcgg cgagttgcgt gactacctac gggtaacagt 3480 ttctttatgg cagggtgaaa cgcaggtcgc cagcggcacc gcgcctttcg gcggtgaaat 3540 tatcgatgag cgtggtggtt atgccgatcg cgtcacacta cgtctgaacg tcgaaaaccc 3600 gaaactgtgg agcgccgaaa tcccgaatct ctatcgtgcg gtggttgaac tgcacaccgc 3660 cgacggcacg ctgattgaag cagaagcctg cgatgtcggt ttccgcgagg tgcggattga 3720 aaatggtctg ctgctgctga acggcaagcc gttgctgatt cgaggcgtta accgtcacga 3780 gcatcatcct ctgcatggtc aggtcatgga tgagcagacg atggtgcagg atatcctgct 3840 gatgaagcag aacaacttta acgccgtgcg ctgttcgcat tatccgaacc atccgctgtg 3900 gtacacgctg tgcgaccgct acggcctgta tgtggtggat gaagccaata ttgaaaccca 3960 cggcatggtg ccaatgaatc gtctgaccga tgatccgcgc tggctaccgg cgatgagcga 4020 acgcgtaacg cgaatggtgc agcgcgatcg taatcacccg agtgtgatca tctggtcgct 4080 ggggaatgaa tcaggccacg gcgctaatca cgacgcgctg tatcgctgga tcaaatctgt 4140 cgatccttcc cgcccggtgc agtatgaagg cggcggagcc gacaccacgg ccaccgatat 4200 tatttgcccg atgtacgcgc gcgtggatga agaccagccc ttcccggctg tgccgaaatg 4260 gtccatcaaa aaatggcttt cgctacctgg agagacgcgc ccgctgatcc tttgcgaata 4320 cgcccacgcg atgggtaaca gtcttggcgg tttcgctaaa tactggcagg cgtttcgtca 4380 gtatccccgt ttacagggcg gcttcgtctg ggactgggtg gatcagtcgc tgattaaata 4440 tgatgaaaac ggcaacccgt ggtcggctta cggcggtgat tttggcgata cgccgaacga 4500 tcgccagttc tgtatgaacg gtctggtctt tgccgaccgc acgccgcatc cagcgctgac 4560 ggaagcaaaa caccagcagc agtttttcca gttccgttta tccgggcaaa ccatcgaagt 4620 gaccagcgaa tacctgttcc gtcatagcga taacgagctc ctgcactgga tggtggcgct 4680 ggatggtaag ccgctggcaa gcggtgaagt gcctctggat gtcgctccac aaggtaaaca 
4740 gttgattgaa ctgcctgaac taccgcagcc ggagagcgcc gggcaactct ggctcacagt 4800 acgcgtagtg caaccgaacg cgaccgcatg gtcagaagcc gggcacatca gcgcctggca 4860 gcagtggcgt ctggcggaaa acctcagtgt gacgctcccc gccgcgtccc acgccatccc 4920 gcatctgacc accagcgaaa tggatttttg catcgagctg ggtaataagc gttggcaatt 4980 taaccgccag tcaggctttc tttcacagat gtggattggc gataaaaaac aactgctgac 5040 gccgctgcgc gatcagttca cccgtgcacc gctggataac gacattggcg taagtgaagc 5100 gacccgcatt gaccctaacg cctgggtcga acgctggaag gcggcgggcc attaccaggc 5160 cgaagcagcg ttgttgcagt gcacggcaga tacacttgct gatgcggtgc tgattacgac 5220 cgctcacgcg tggcagcatc aggggaaaac cttatttatc agccggaaaa cctaccggat 5280 tgatggtagt ggtcaaatgg cgattaccgt tgatgttgaa gtggcgagcg atacaccgca 5340 tccggcgcgg attggcctga actgccagct ggcgcaggta gcagagcggg taaactggct 5400 cggattaggg ccgcaagaaa actatcccga ccgccttact gccgcctgtt ttgaccgctg 5460 ggatctgcca ttgtcagaca tgtatacccc gtacgtcttc ccgagcgaaa acggtctgcg 5520 ctgcgggacg cgcgaattga attatggccc acaccagtgg cgcggcgact tccagttcaa 5580 catcagccgc tacagtcaac agcaactgat ggaaaccagc catcgccatc tgctgcacgc 5640 ggaagaaggc acatggctga atatcgacgg tttccatatg gggattggtg gcgacgactc 5700 ctggagcccg tcagtatcgg cggaattcca gctgagcgcc ggtcgctacc attaccagtt 5760 ggtctggtgt caaaaataat aataaccggg caggggggat ccgcagatcc ggctgtggaa 5820 tgtgtgtcag ttagggtgtg gaaagtcccc aggctcccca gcaggcagaa gtatgcaaag 5880 catgcctgca ggaattcgat atcaagctta tcgataccgt cgacctcgag ggggggcccg 5940 gtacccagct tttgttccct ttagtgaggg ttaattgcgc gggaagtatt tatcactaat 6000 caagcacaag taatacatga gaaactttta ctacagcaag cacaatcctc caaaaaattt 6060 tgtttttaca aaatccctgg tgaacatgat tggaagggac ctactagggt gctgtggaag 6120 ggtgatggtg cagtagtagt taatgatgaa ggaaagggaa taattgctgt accattaacc 6180 aggactaagt tactaataaa accaaattga gtattgttgc aggaagcaag acccaactac 6240 cattgtcagc tgtgtttcct gacctcaata tttgttataa ggtttgatat gaatcccagg 6300 gggaatctca acccctatta cccaacagtc agaaaaatct aagtgtgagg agaacacaat 6360 gtttcaacct tattgttata ataatgacag taagaacagc atggcagaat cgaaggaagc 6420 
aagagaccaa gaatgaacct gaaagaagaa tctaaagaag aaaaaagaag aaatgactgg 6480 tggaaaatag gtatgtttct gttatgctta gcaggaacta ctggaggaat actttggtgg 6540 tatgaaggac tcccacagca acattatata gggttggtgg cgataggggg aagattaaac 6600 ggatctggcc aatcaaatgc tatagaatgc tggggttcct tcccggggtg tagaccattt 6660 caaaattact tcagttatga gaccaataga agcatgcata tggataataa tactgctaca 6720 ttattagaag ctttaaccaa tataactgct ctataaataa caaaacagaa ttagaaacat 6780 ggaagttagt aaagacttct ggcataactc ctttacctat ttcttctgaa gctaacactg 6840 gactaattag acataagaga gattttggta taagtgcaat agtggcagct attgtagccg 6900 ctactgctat tgctgctagc gctactatgt cttatgttgc tctaactgag gttaacaaaa 6960 taatggaagt acaaaatcat acttttgagg tagaaaatag tactctaaat ggtatggatt 7020 taatagaacg acaaataaag atattatatg ctatgattct tcaaacacat gcagatgttc 7080 aactgttaaa ggaaagacaa caggtagagg agacatttaa tttaattgga tgtatagaaa 7140 gaacacatgt attttgtcat actggtcatc cctggaatat gtcatgggga catttaaatg 7200 agtcaacaca atgggatgac tgggtaagca aaatggaaga tttaaatcaa gagatactaa 7260 ctacacttca tggagccagg aacaatttgg cacaatccat gataacattc aatacaccag 7320 atagtatagc tcaatttgga aaagaccttt ggagtcatat tggaaattgg attcctggat 7380 tgggagcttc cattataaaa tatatagtga tgtttttgct tatttatttg ttactaacct 7440 cttcgcctaa gatcctcagg gccctctgga aggtgaccag tggtgcaggg tcctccggca 7500 gtcgttacct gaagaaaaaa ttccatcaca aacatgcatc gcgagaagac acctgggacc 7560 aggcccaaca caacatacac ctagcaggcg tgaccggtgg atcaggggac aaatactaca 7620 agcagaagta ctccaggaac gactggaatg gagaatcaga ggagtacaac aggcggccaa 7680 agagctgggt gaagtcaatc gaggcatttg gagagagcta tatttccgag aagaccaaag 7740 gggagatttc tcagcctggg gcggctatca acgagcacaa gaacggctct ggggggaaca 7800 atcctcacca agggtcctta gacctggaga ttcgaagcga aggaggaaac atttatgact 7860 gttgcattaa agcccaagaa ggaactctcg ctatcccttg ctgtggattt cccttatggc 7920 tattttgggg actagtaatt atagtaggac gcatagcagg ctatggatta cgtggactcg 7980 ctgttataat aaggatttgt attagaggct taaatttgat atttgaaata atcagaaaaa 8040 tgcttgatta tattggaaga gctttaaatc ctggcacatc tcatgtatca atgcctcagt 8100 atgtttagaa 
aaacaagggg ggaactgtgg ggtttttatg aggggtttta taaatgatta 8160 taagagtaaa aagaaagttg ctgatgctct cataaccttg tataacccaa aggactagct 8220 catgttgcta ggcaactaaa ccgcaataac cgcatttgtg acgcgagttc cccattggtg 8280 acgcgttaac ttcctgtttt tacagtatat aagtgcttgt attctgacaa ttgggcactc 8340 agattctgcg gtctgagtcc cttctctgct gggctgaaaa ggcctttgta ataaatataa 8400 ttctctactc agtccctgtc tctagtttgt ctgttcgaga tcctacagag ctcatgcctt 8460 ggcgtaatca tggtcatagc tgtttcctgt gtgaaattgt tatccgctca caattccaca 8520 caacatacga gccggaagca taaagtgtaa agcctggggt gcctaatgag tgagctaact 8580 cacattaatt gcgttgcgct cactgcccgc tttccagtcg ggaaacctgt cgtgccagct 8640 gcattaatga atcggccaac gcgcggggag aggcggtttg cgtattgggc gctcttccgc 8700 ttcctcgctc actgactcgc tgcgctcggt cgttcggctg cggcgagcgg tatcagctca 8760 ctcaaaggcg gtaatacggt tatccacaga atcaggggat aacgcaggaa agaacatgtg 8820 agcaaaaggc cagcaaaagg ccaggaaccg taaaaaggcc gcgttgctgg cgtttttcca 8880 taggctccgc ccccctgacg agcatcacaa aaatcgacgc tcaagtcaga ggtggcgaaa 8940 cccgacagga ctataaagat accaggcgtt tccccctgga agctccctcg tgcgctctcc 9000 tgttccgacc ctgccgctta ccggatacct gtccgccttt ctcccttcgg gaagcgtggc 9060 gctttctcat agctcacgct gtaggtatct cagttcggtg taggtcgttc gctccaagct 9120 gggctgtgtg cacgaacccc ccgttcagcc cgaccgctgc gccttatccg gtaactatcg 9180 tcttgagtcc aacccggtaa gacacgactt atcgccactg gcagcagcca ctggtaacag 9240 gattagcaga gcgaggtatg taggcggtgc tacagagttc ttgaagtggt ggcctaacta 9300 cggctacact agaaggacag tatttggtat ctgcgctctg ctgaagccag ttaccttcgg 9360 aaaaagagtt ggtagctctt gatccggcaa acaaaccacc gctggtagcg gtggtttttt 9420 tgtttgcaag cagcagatta cgcgcagaaa aaaaggatct caagaagatc ctttgatctt 9480 ttctacgggg tctgacgctc agtggaacga aaactcacgt taagggattt tggtcatgag 9540 attatcaaaa aggatcttca cctagatcct tttaaattaa aaatgaagtt ttaaatcaat 9600 ctaaagtata tatgagtaaa cttggtctga cagttaccaa tgcttaatca gtgaggcacc 9660 tatctcagcg atctgtctat ttcgttcatc catagttgcc tgactccccg tcgtgtagat 9720 aactacgata cgggagggct taccatctgg ccccagtgct gcaatgatac cgcgagaccc 9780 acgctcaccg gctccagatt 
tatcagcaat aaaccagcca gccggaaggg ccgagcgcag 9840 aagtggtcct gcaactttat ccgcctccat ccagtctatt aattgttgcc gggaagctag 9900 agtaagtagt tcgccagtta atagtttgcg caacgttgtt gccattgcta caggcatcgt 9960 ggtgtcacgc tcgtcgtttg gtatggcttc attcagctcc ggttcccaac gatcaaggcg 10020 agttacatga tcccccatgt tgtgcaaaaa agcggttagc tccttcggtc ctccgatcgt 10080 tgtcagaagt aagttggccg cagtgttatc actcatggtt atggcagcac tgcataattc 10140 tcttactgtc atgccatccg taagatgctt ttctgtgact ggtgagtact caaccaagtc 10200 attctgagaa tagtgtatgc ggcgaccgag ttgctcttgc ccggcgtcaa tacgggataa 10260 taccgcgcca catagcagaa ctttaaaagt gctcatcatt ggaaaacgtt cttcggggcg 10320 aaaactctca aggatcttac cgctgttgag atccagttcg atgtaaccca ctcgtgcacc 10380 caactgatct tcagcatctt ttactttcac cagcgtttct gggtgagcaa aaacaggaag 10440 gcaaaatgcc gcaaaaaagg gaataagggc gacacggaaa tgttgaatac tcatactctt 10500 cctttttcaa tattattgaa gcatttatca gggttattgt ctcatgagcg gatacatatt 10560 tgaatgtatt tagaaaaata aacaaatagg ggttccgcgc acatttcccc gaaaagtgcc 10620 acctaaattg taagcgttaa tattttgtta aaattcgcgt taaatttttg ttaaatcagc 10680 tcatttttta accaataggc cgaaatcggc aaaatccctt ataaatcaaa agaatagacc 10740 gagatagggt tgagtgttgt tccagtttgg aacaagagtc cactattaaa gaacgtggac 10800 tccaacgtca aagggcgaaa aaccgtctat cagggcgatg gcccactacg tgaaccatca 10860 ccctaatcaa gttttttggg gtcgaggtgc cgtaaagcac taaatcggaa ccctaaaggg 10920 agcccccgat ttagagcttg acggggaaag ccaacctggc ttatcgaaat taatacgact 10980 cactataggg agaccggc 10998 5 12481 DNA Artificial Sequence Description of Artificial Sequence pONY3.1, EIAV gag/pol expression plasmid 5 tcaatattgg ccattagcca tattattcat tggttatata gcataaatca atattggcta 60 ttggccattg catacgttgt atctatatca taatatgtac atttatattg gctcatgtcc 120 aatatgaccg ccatgttggc attgattatt gactagttat taatagtaat caattacggg 180 gtcattagtt catagcccat atatggagtt ccgcgttaca taacttacgg taaatggccc 240 gcctggctga ccgcccaacg acccccgccc attgacgtca ataatgacgt atgttcccat 300 agtaacgcca atagggactt tccattgacg tcaatgggtg gagtatttac ggtaaactgc 360 ccacttggca gtacatcaag 
tgtatcatat gccaagtccg ccccctattg acgtcaatga 420 cggtaaatgg cccgcctggc attatgccca gtacatgacc ttacgggact ttcctacttg 480 gcagtacatc tacgtattag tcatcgctat taccatggtg atgcggtttt ggcagtacac 540 caatgggcgt ggatagcggt ttgactcacg gggatttcca agtctccacc ccattgacgt 600 caatgggagt ttgttttggc accaaaatca acgggacttt ccaaaatgtc gtaacaactg 660 cgatcgcccg ccccgttgac gcaaatgggc ggtaggcgtg tacggtggga ggtctatata 720 agcagagctc gtttagtgaa ccgtcagatc actagaagct ttattgcggt agtttatcac 780 agttaaattg ctaacgcagt cagtgcttct gacacaacag tctcgaactt aagctgcagt 840 gactctctta aggtagcctt gcagaagttg gtcgtgaggc actgggcagg taagtatcaa 900 ggttacaaga caggtttaag gagaccaata gaaactgggc ttgtcgagac agagaagact 960 cttgcgtttc tgataggcac ctattggtct tactgacatc cactttgcct ttctctccac 1020 aggtgtccac tcccagttca attacagctc ttaaggctag agtacttaat acgactcact 1080 ataggctagc ctcgaggtcg acggtatcgc ccgaacaggg acctgagagg ggcgcagacc 1140 ctacctgttg aacctggctg atcgtaggat ccccgggaca gcagaggaga acttacagaa 1200 gtcttctgga ggtgttcctg gccagaacac aggaggacag gtaagatggg agaccctttg 1260 acatggagca aggcgctcaa gaagttagag aaggtgacgg tacaagggtc tcagaaatta 1320 actactggta actgtaattg ggcgctaagt ctagtagact tatttcatga taccaacttt 1380 gtaaaagaaa aggactggca gctgagggat gtcattccat tgctggaaga tgtaactcag 1440 acgctgtcag gacaagaaag agaggccttt gaaagaacat ggtgggcaat ttctgctgta 1500 aagatgggcc tccagattaa taatgtagta gatggaaagg catcattcca gctcctaaga 1560 gcgaaatatg aaaagaagac tgctaataaa aagcagtctg agccctctga agaatatcca 1620 atcatgatag atggggctgg aaacagaaat tttagacctc taacacctag aggatatact 1680 acttgggtga ataccataca gacaaatggt ctattaaatg aagctagtca aaacttattt 1740 gggatattat cagtagactg tacttctgaa gaaatgaatg catttttgga tgtggtacct 1800 ggccaggcag gacaaaagca gatattactt gatgcaattg ataagatagc agatgattgg 1860 gataatagac atccattacc gaatgctcca ctggtggcac caccacaagg gcctattccc 1920 atgacagcaa ggtttattag aggtttagga gtacctagag aaagacagat ggagcctgct 1980 tttgatcagt ttaggcagac atatagacaa tggataatag aagccatgtc agaaggcatc 2040 aaagtgatga ttggaaaacc taaagctcaa aatattaggc 
aaggagctaa ggaaccttac 2100 ccagaatttg tagacagact attatcccaa ataaaaagtg agggacatcc acaagagatt 2160 tcaaaattct tgactgatac actgactatt cagaacgcaa atgaggaatg tagaaatgct 2220 atgagacatt taagaccaga ggatacatta gaagagaaaa tgtatgcttg cagagacatt 2280 ggaactacaa aacaaaagat gatgttattg gcaaaagcac ttcagactgg tcttgcgggc 2340 ccatttaaag gtggagcctt gaaaggaggg ccactaaagg cagcacaaac atgttataac 2400 tgtgggaagc caggacattt atctagtcaa tgtagagcac ctaaagtctg ttttaaatgt 2460 aaacagcctg gacatttctc aaagcaatgc agaagtgttc caaaaaacgg gaagcaaggg 2520 gctcaaggga ggccccagaa acaaactttc ccgatacaac agaagagtca gcacaacaaa 2580 tctgttgtac aagagactcc tcagactcaa aatctgtacc cagatctgag cgaaataaaa 2640 aaggaataca atgtcaagga gaaggatcaa gtagaggatc tcaacctgga cagtttgtgg 2700 gagtaacata taatctagag aaaaggccta ctacaatagt attaattaat gatactccct 2760 taaatgtact gttagacaca ggagcagata cttcagtgtt gactactgca cattataata 2820 ggttaaaata tagagggaga aaatatcaag ggacgggaat aataggagtg ggaggaaatg 2880 tggaaacatt ttctacgcct gtgactataa agaaaaaggg tagacacatt aagacaagaa 2940 tgctagtggc agatattcca gtgactattt tgggacgaga tattcttcag gacttaggtg 3000 caaaattggt tttggcacag ctctccaagg aaataaaatt tagaaaaata gagttaaaag 3060 agggcacaat ggggccaaaa attcctcaat ggccactcac taaggagaaa ctagaagggg 3120 ccaaagagat agtccaaaga ctattgtcag agggaaaaat atcagaagct agtgacaata 3180 atccttataa ttcacccata tttgtaataa aaaagaggtc tggcaaatgg aggttattac 3240 aagatctgag agaattaaac aaaacagtac aagtaggaac ggaaatatcc agaggattgc 3300 ctcacccggg aggattaatt aaatgtaaac acatgactgt attagatatt ggagatgcat 3360 atttcactat acccttagat ccagagttta gaccatatac agctttcact attccctcca 3420 ttaatcatca agaaccagat aaaagatatg tgtggaaatg tttaccacaa ggattcgtgt 3480 tgagcccata tatatatcag aaaacattac aggaaatttt acaacctttt agggaaagat 3540 atcctgaagt acaattgtat caatatatgg atgatttgtt catgggaagt aatggttcta 3600 aaaaacaaca caaagagtta atcatagaat taagggcgat cttactggaa aagggttttg 3660 agacaccaga tgataaatta caagaagtgc caccttatag ctggctaggt tatcaacttt 3720 gtcctgaaaa ttggaaagta caaaaaatgc aattagacat ggtaaagaat 
ccaaccctta 3780 atgatgtgca aaaattaatg gggaatataa catggatgag ctcagggatc ccagggttga 3840 cagtaaaaca cattgcagct actactaagg gatgtttaga gttgaatcaa aaagtaattt 3900 ggacggaaga ggcacaaaaa gagttagaag aaaataatga gaagattaaa aatgctcaag 3960 ggttacaata ttataatcca gaagaagaaa tgttatgtga ggttgaaatt acaaaaaatt 4020 atgaggcaac ttatgttata aaacaatcac aaggaatcct atgggcaggt aaaaagatta 4080 tgaaggctaa taagggatgg tcaacagtaa aaaatttaat gttattgttg caacatgtgg 4140 caacagaaag tattactaga gtaggaaaat gtccaacgtt taaggtacca tttaccaaag 4200 agcaagtaat gtgggaaatg caaaaaggat ggtattattc ttggctccca gaaatagtat 4260 atacacatca agtagttcat gatgattgga gaatgaaatt ggtagaagaa cctacatcag 4320 gaataacaat atacactgat gggggaaaac aaaatggaga aggaatagca gcttatgtga 4380 ccagtaatgg gagaactaaa cagaaaaggt taggacctgt cactcatcaa gttgctgaaa 4440 gaatggcaat acaaatggca ttagaggata ccagagataa acaagtaaat atagtaactg 4500 atagttatta ttgttggaaa aatattacag aaggattagg tttagaagga ccacaaagtc 4560 cttggtggcc tataatacaa aatatacgag aaaaagagat agtttatttt gcttgggtac 4620 ctggtcacaa agggatatat ggtaatcaat tggcagatga agccgcaaaa ataaaagaag 4680 aaatcatgct agcataccaa ggcacacaaa ttaaagagaa aagagatgaa gatgcagggt 4740 ttgacttatg tgttccttat gacatcatga tacctgtatc tgacacaaaa atcataccca 4800 cagatgtaaa aattcaagtt cctcctaata gctttggatg ggtcactggg aaatcatcaa 4860 tggcaaaaca ggggttatta attaatggag gaataattga tgaaggatat acaggagaaa 4920 tacaagtgat atgtactaat attggaaaaa gtaatattaa attaatagag ggacaaaaat 4980 ttgcacaatt aattatacta cagcatcact caaattccag acagccttgg gatgaaaata 5040 aaatatctca gagaggggat aaaggatttg gaagtacagg agtattctgg gtagaaaata 5100 ttcaggaagc acaagatgaa catgagaatt ggcatacatc accaaagata ttggcaagaa 5160 attataagat accattgact gtagcaaaac agataactca agaatgtcct cattgcacta 5220 agcaaggatc aggacctgca ggttgtgtca tgagatctcc taatcattgg caggcagatt 5280 gcacacattt ggacaataag ataatattga cttttgtaga gtcaaattca ggatacatac 5340 atgctacatt attgtcaaaa gaaaatgcat tatgtacttc attggctatt ttagaatggg 5400 caagattgtt ttcaccaaag tccttacaca cagataacgg cactaatttt gtggcagaac 
5460 cagttgtaaa tttgttgaag ttcctaaaga tagcacatac cacaggaata ccatatcatc 5520 cagaaagtca gggtattgta gaaagggcaa ataggacctt gaaagagaag attcaaagtc 5580 atagagacaa cactcaaaca ctggaggcag ctttacaact tgctctcatt acttgtaaca 5640 aagggaggga aagtatggga ggacagacac catgggaagt atttatcact aatcaagcac 5700 aagtaataca tgagaaactt ttactacagc aagcacaatc ctccaaaaaa ttttgttttt 5760 acaaaatccc tggtgaacat gattggaagg gacctactag ggtgctgtgg aagggtgatg 5820 gtgcagtagt agttaatgat gaaggaaagg gaataattgc tgtaccatta accaggacta 5880 agttactaat aaaaccaaat tgagtattgt tgcaggaagc aagacccaac taccattgtc 5940 agctgtgttt cctgaggtct ctaggaattg attacctcga tgcttcatta aggaagaaga 6000 ataaacaaag actgaaggca atccaacaag gaagacaacc tcaatatttg ttataaggtt 6060 tgatatatgg gagtatttgg taaaggggta acatggtcag catcgcattc tatgggggaa 6120 tcccaggggg aatctcaacc cctattaccc aacagtcaga aaaatctaag tgtgaggaga 6180 acacaatgtt tcaaccttat tgttataata atgacagtaa gaacagcatg gcagaatcga 6240 aggaagcaag agaccaagaa atgaacctga aagaagaatc taaagaagaa aaaagaagaa 6300 atgactggtg gaaaataggt atgtttctgt tatgcttagc aggaactact ggaggaatac 6360 tttggtggta tgaaggactc ccacagcaac attatatagg gttggtggcg atagggggaa 6420 gattaaacgg atctggccaa tcaaatgcta tagaatgctg gggttccttc ccggggtgta 6480 gaccatttca aaattacttc agttatgaga ccaatagaag catgcatatg gataataata 6540 ctgctacatt attagaagct ttaaccaata taactgctct ataaataaca aaacagaatt 6600 agaaacatgg aagttagtaa agacttctgg cataactcct ttacctattt cttctgaagc 6660 taacactgga ctaattagac ataagagaga ttttggtata agtgcaatag tggcagctat 6720 tgtagccgct actgctattg ctgctagcgc tactatgtct tatgttgctc taactgaggt 6780 taacaaaata atggaagtac aaaatcatac ttttgaggta gaaaatagta ctctaaatgg 6840 tatggattta atagaacgac aaataaagat attatatgct atgattcttc aaacacatgc 6900 agatgttcaa ctgttaaagg aaagacaaca ggtagaggag acatttaatt taattggatg 6960 tatagaaaga acacatgtat tttgtcatac tggtcatccc tggaatatgt catggggaca 7020 tttaaatgag tcaacacaat gggatgactg ggtaagcaaa atggaagatt taaatcaaga 7080 gatactaact acacttcatg gagccaggaa caatttggca caatccatga taacattcaa 7140 
tacaccagat agtatagctc aatttggaaa agacctttgg agtcatattg gaaattggat 7200 tcctggattg ggagcttcca ttataaaata tatagtgatg tttttgctta tttatttgtt 7260 actaacctct tcgcctaaga tcctcagggc cctctggaag gtgaccagtg gtgcagggtc 7320 ctccggcagt cgttacctga agaaaaaatt ccatcacaaa catgcatcgc gagaagacac 7380 ctgggaccag gcccaacaca acatacacct agcaggcgtg accggtggat caggggacaa 7440 atactacaag cagaagtact ccaggaacga ctggaatgga gaatcagagg agtacaacag 7500 gcggccaaag agctgggtga agtcaatcga ggcatttgga gagagctata tttccgagaa 7560 gaccaaaggg gagatttctc agcctggggc ggctatcaac gagcacaaga acggctctgg 7620 ggggaacaat cctcaccaag ggtccttaga cctggagatt cgaagcgaag gaggaaacat 7680 ttatgactgt tgcattaaag cccaagaagg aactctcgct atcccttgct gtggatttcc 7740 cttatggcta ttttggggac tagtaattat agtaggacgc atagcaggct atggattacg 7800 tggactcgct gttataataa ggatttgtat tagaggctta aatttgatat ttgaaataat 7860 cagaaaaatg cttgattata ttggaagagc tttaaatcct ggcacatctc atgtatcaat 7920 gcctcagtat gtttagaaaa acaagggggg aactgtgggg tttttatgag gggttttata 7980 aatgattata agagtaaaaa gaaagttgct gatgctctca taaccttgta taacccaaag 8040 gactagctca tgttgctagg caactaaacc gcaataaccg catttgtgac gcgagttccc 8100 cattggtgac gcgtggtacc tctagagtcg acccgggcgg ccgcttccct ttagtgaggg 8160 ttaatgcttc gagcagacat gataagatac attgatgagt ttggacaaac cacaactaga 8220 atgcagtgaa aaaaatgctt tatttgtgaa atttgtgatg ctattgcttt atttgtaacc 8280 attataagct gcaataaaca agttaacaac aacaattgca ttcattttat gtttcaggtt 8340 cagggggaga tgtgggaggt tttttaaagc aagtaaaacc tctacaaatg tggtaaaatc 8400 cgataaggat cgatccgggc tggcgtaata gcgaagaggc ccgcaccgat cgcccttccc 8460 aacagttgcg cagcctgaat ggcgaatgga cgcgccctgt agcggcgcat taagcgcggc 8520 gggtgtggtg gttacgcgca gcgtgaccgc tacacttgcc agcgccctag cgcccgctcc 8580 tttcgctttc ttcccttcct ttctcgccac gttcgccggc tttccccgtc aagctctaaa 8640 tcgggggctc cctttagggt tccgatttag agctttacgg cacctcgacc gcaaaaaact 8700 tgatttgggt gatggttcac gtagtgggcc atcgccctga tagacggttt ttcgcccttt 8760 gacgttggag tccacgttct ttaatagtgg actcttgttc caaactggaa caacactcaa 8820 ccctatctcg 
gtctattctt ttgatttata agggattttg ccgatttcgg cctattggtt 8880 aaaaaatgag ctgatttaac aaatatttaa cgcgaatttt aacaaaatat taacgtttac 8940 aatttcgcct gatgcggtat tttctcctta cgcatctgtg cggtatttca caccgcatac 9000 gcggatctgc gcagcaccat ggcctgaaat aacctctgaa agaggaactt ggttaggtac 9060 cttctgaggc ggaaagaacc agctgtggaa tgtgtgtcag ttagggtgtg gaaagtcccc 9120 aggctcccca gcaggcagaa gtatgcaaag catgcatctc aattagtcag caaccaggtg 9180 tggaaagtcc ccaggctccc cagcaggcag aagtatgcaa agcatgcatc tcaattagtc 9240 agcaaccata gtcccgcccc taactccgcc catcccgccc ctaactccgc ccagttccgc 9300 ccattctccg ccccatggct gactaatttt ttttatttat gcagaggccg aggccgcctc 9360 ggcctctgag ctattccaga agtagtgagg aggctttttt ggaggcctag gcttttgcaa 9420 aaagcttgat tcttctgaca caacagtctc gaacttaagg ctagagccac catgattgaa 9480 caagatggat tgcacgcagg ttctccggcc gcttgggtgg agaggctatt cggctatgac 9540 tgggcacaac agacaatcgg ctgctctgat gccgccgtgt tccggctgtc agcgcagggg 9600 cgcccggttc tttttgtcaa gaccgacctg tccggtgccc tgaatgaact gcaggacgag 9660 gcagcgcggc tatcgtggct ggccacgacg ggcgttcctt gcgcagctgt gctcgacgtt 9720 gtcactgaag cgggaaggga ctggctgcta ttgggcgaag tgccggggca ggatctcctg 9780 tcatctcacc ttgctcctgc cgagaaagta tccatcatgg ctgatgcaat gcggcggctg 9840 catacgcttg atccggctac ctgcccattc gaccaccaag cgaaacatcg catcgagcga 9900 gcacgtactc ggatggaagc cggtcttgtc gatcaggatg atctggacga agagcatcag 9960 gggctcgcgc cagccgaact gttcgccagg ctcaaggcgc gcatgcccga cggcgaggat 10020 ctcgtcgtga cccatggcga tgcctgcttg ccgaatatca tggtggaaaa tggccgcttt 10080 tctggattca tcgactgtgg ccggctgggt gtggcggacc gctatcagga catagcgttg 10140 gctacccgtg atattgctga agagcttggc ggcgaatggg ctgaccgctt cctcgtgctt 10200 tacggtatcg ccgctcccga ttcgcagcgc atcgccttct atcgccttct tgacgagttc 10260 ttctgagcgg gactctgggg ttcgaaatga ccgaccaagc gacgcccaac ctgccatcac 10320 gatggccgca ataaaatatc tttattttca ttacatctgt gtgttggttt tttgtgtgaa 10380 tcgatagcga taaggatccg cgtatggtgc actctcagta caatctgctc tgatgccgca 10440 tagttaagcc agccccgaca cccgccaaca cccgctgacg cgccctgacg ggcttgtctg 10500 ctcccggcat 
ccgcttacag acaagctgtg accgtctccg ggagctgcat gtgtcagagg 10560 ttttcaccgt catcaccgaa acgcgcgaga cgaaagggcc tcgtgatacg cctattttta 10620 taggttaatg tcatgataat aatggtttct tagacgtcag gtggcacttt tcggggaaat 10680 gtgcgcggaa cccctatttg tttatttttc taaatacatt caaatatgta tccgctcatg 10740 agacaataac cctgataaat gcttcaataa tattgaaaaa ggaagagtat gagtattcaa 10800 catttccgtg tcgcccttat tccctttttt gcggcatttt gccttcctgt ttttgctcac 10860 ccagaaacgc tggtgaaagt aaaagatgct gaagatcagt tgggtgcacg agtgggttac 10920 atcgaactgg atctcaacag cggtaagatc cttgagagtt ttcgccccga agaacgtttt 10980 ccaatgatga gcacttttaa agttctgcta tgtggcgcgg tattatcccg tattgacgcc 11040 gggcaagagc aactcggtcg ccgcatacac tattctcaga atgacttggt tgagtactca 11100 ccagtcacag aaaagcatct tacggatggc atgacagtaa gagaattatg cagtgctgcc 11160 ataaccatga gtgataacac tgcggccaac ttacttctga caacgatcgg aggaccgaag 11220 gagctaaccg cttttttgca caacatgggg gatcatgtaa ctcgccttga tcgttgggaa 11280 ccggagctga atgaagccat accaaacgac gagcgtgaca ccacgatgcc tgtagcaatg 11340 gcaacaacgt tgcgcaaact attaactggc gaactactta ctctagcttc ccggcaacaa 11400 ttaatagact ggatggaggc ggataaagtt gcaggaccac ttctgcgctc ggcccttccg 11460 gctggctggt ttattgctga taaatctgga gccggtgagc gtgggtctcg cggtatcatt 11520 gcagcactgg ggccagatgg taagccctcc cgtatcgtag ttatctacac gacggggagt 11580 caggcaacta tggatgaacg aaatagacag atcgctgaga taggtgcctc actgattaag 11640 cattggtaac tgtcagacca agtttactca tatatacttt agattgattt aaaacttcat 11700 ttttaattta aaaggatcta ggtgaagatc ctttttgata atctcatgac caaaatccct 11760 taacgtgagt tttcgttcca ctgagcgtca gaccccgtag aaaagatcaa aggatcttct 11820 tgagatcctt tttttctgcg cgtaatctgc tgcttgcaaa caaaaaaacc accgctacca 11880 gcggtggttt gtttgccgga tcaagagcta ccaactcttt ttccgaaggt aactggcttc 11940 agcagagcgc agataccaaa tactgtcctt ctagtgtagc cgtagttagg ccaccacttc 12000 aagaactctg tagcaccgcc tacatacctc gctctgctaa tcctgttacc agtggctgct 12060 gccagtggcg ataagtcgtg tcttaccggg ttggactcaa gacgatagtt accggataag 12120 gcgcagcggt cgggctgaac ggggggttcg tgcacacagc ccagcttgga gcgaacgacc 
12180 tacaccgaac tgagatacct acagcgtgag ctatgagaaa gcgccacgct tcccgaaggg 12240 agaaaggcgg acaggtatcc ggtaagcggc agggtcggaa caggagagcg cacgagggag 12300 cttccagggg gaaacgcctg gtatctttat agtcctgtcg ggtttcgcca cctctgactt 12360 gagcgtcgat ttttgtgatg ctcgtcaggg gggcggagcc tatggaaaaa cgccagcaac 12420 gcggcctttt tacggttcct ggccttttgc tggccttttg ctcacatggc tcgacagatc 12480 t 12481 6 6845 DNA Artificial Sequence Description of Artificial Sequence pRV67, VSV-G expression plasmid 6 tcgacctgca ggatatcgaa ttcattgatc ataatcagcc ataccacatt tgtagaggtt 60 ttacttgctt taaaaaacct cccacacctc cccctgaacc tgaaacataa aatgaatgca 120 attgttgttg ttaacttgtt tattgcagct tataatggtt acaaataaag caatagcatc 180 acaaatttca caaataaagc atttttttca ctgcattcta gttgtggttt gtccaaactc 240 atcaatgtat cttatcatgt ctggatccgt accgagctcg cgtaatcatg tcatagctgt 300 ttcctgtgtg aaattgttat ccgctcacaa ttccacacaa catacgagcc ggaagcataa 360 agtgtaaagc ctggggtgcc taatgagtga gctaactaca ttaattgcgt tgcgctcack 420 gcccgctttc cartcgggaa acctgtcgtg ccagctgcat taatgaatcg gccaacgcgc 480 ggggagaggc ggtttgcgta ttgggcgctc ttccgcttcc tcgctcactg actcgctgcg 540 ctcggtcgtt cggctgcggc gagcggtatc agctcactca aaggcggtaa tacggttatc 600 cacagaatca ggggataacg caggaaagaa catgtgagca aaaggccagc aaaaggccag 660 gaaccgtaaa aaggccgcgt tgctggcgtt tttccatagg ctccgccccc ctgacgagca 720 tcacaaaaat cgacgctcaa gtcagaggtg gcgaaacccg acaggactat aaagatacca 780 ggcgtttccc cctggaagct ccctcgtgcg ctctcctgtt ccgaccctgc cgcttaccgg 840 atacctgtcc gcctttctcc cttcgggaag cgtggcgctt tctcatagct cacgctgtag 900 gtatctcagt tcggtgtagg tcgttcgctc caagctgggc tgtgtgcacg aaccccccgt 960 tcagcccgac cgctgcgcct tatccggtaa ctatcgtctt gagtccaacc cggtaagaca 1020 cgacttatcg ccactggcag cagccactgg taacaggatt agcagagcga ggtatgtagg 1080 cggtgctaca gagttcttga agtggtggcc taactacggc tacmctagaa gracagtatt 1140 tggkatctgs gcttctgytg aagmcagtta ccttcggaaa aagagttggt agctcttgat 1200 ccggcaaaca aaccaccgct ggtagcggtg gtttttttgt ttgcaagcag cagattacgc 1260 gcagaaaaaa aggatctcaa gaagatcctt tgatcttttc 
tacggggtct gacgctcagt 1320 ggaacgaaaa ctcacgttaa gggattttgg tcatgagatt atcaaaaagg atcttcacct 1380 agatcctttt aaattaaaaa tgaagtttta aatcaatcta aagtatatat gagtaaactt 1440 ggtctgacag ttaccaatgc ttaatcagtg aggcacctat ctcagcgatc tgtctatttc 1500 gttcatccat agttgcctga ctccccgtcg tgtagataac tacgatacgg gagggcttac 1560 catctggccc cagtgctgca atgataccgc gagacccacg ctcaccggct ccagatttat 1620 cagcaataaa ccagccagcc ggaagggccg agcgcagaag tggtcctgca actttatccg 1680 cctccatcca gtctattaat tgttgccggg aagctagagt aagtagttcg ccagttaata 1740 gtttgcgcaa cgttgttgcc attgctacag gcatcgtggt gtcacgctcg tcgtttggta 1800 tggcttcatt cagctccggt tcccaacgat caaggcgagt tacatgatcc cccatgttgt 1860 gcaaaaaagc ggttagctcc ttcggtcctc cgatcgttgt cagaagtaag ttggccgcag 1920 tgttatcact catggttatg gcagcactgc ataattctct tactgtcatg ccatccgtaa 1980 gatgcttttc tgtgactggt gagtactcaa ccaagtcatt ctgagaatag tgtatgcggc 2040 gaccgagttg ctcttgcccg gcgtcaatac gggataatac cgcgccacat agcagaactt 2100 taaaagtgct catcattgga aaacgttctt cggggcgaaa actctcaagg atcttaccgc 2160 tgttgagatc cagttcgatg taacccactc gtgcacccaa ctgatcttca gcatctttta 2220 ctttcaccag cgtttctggg tgagcaaaaa caggaaggca aaatgccgca aaaaagggaa 2280 taagggcgac acggaaatgt tgaatactca tactcttcct ttttcaataa gcggccgcgg 2340 ccatgccggc cactagtctc gagttattat tgaagcattt atcagggtta ttgtctcatg 2400 agcggataca tatttgaatg tatttagaaa aataaacaaa taggggttcc gcgcacattt 2460 ccccgaaaag tgccacctga cgtctaagaa accattatta tcatgacatt aacctataaa 2520 aataggcgta tcacgaggcc ctttcgtctc gcgcgtttcg gtgatgacgg tgaaaacctc 2580 tgacacatgc agctcccgga gacggtcaca gcttgtctgt aagcggatgc cgggagcaga 2640 caagcccgtc agggcgcgtc agcgggtgtt ggcgggtgtc ggggctggct taactatgcg 2700 gcatcagagc agattgtact gagagtgcac catatgaaga cgtcgcctcc tcactacttc 2760 tggaatagct cagaggccga ggcggcctcg gcctctgcat aaataaaaaa aatwaktcas 2820 ggcgccattc gccattcagg ctgcgcaact gttgggaagg gcgatcggtg cgggcctctt 2880 cgctattacg ccagctggcg aaagggggat gtgctgcaag gcgattaagt tgggtaacgc 2940 cagggttttc ccagtcacga cgttgtaaaa cgacggccag tgccatcgtg 
tcaaaggaca 3000 gtgactgcag tgaataataa aatgtgtgtt tgtccgaaat acgcgttttg agawttctgt 3060 cgccgactaa attcatgtcg cgcgatartg gtgtttatcg ccgatagaga tggcgatatt 3120 ggaaaaatcg atatttgaaa atatggcata ttgaaaatgt cgccgatgtg agtttctgtg 3180 taactgatat cgccattttt ccaaaagttg atttttgggc atacgcgata tctggcgata 3240 cgcttatatc gtttacgggg gatggcgata gacgcctttg gtgacttggg cgattctgtg 3300 tgtcgcaaat atcgcagttt cgatataggt gacagacgat atgaggctat atcgccgata 3360 gaggcgacat caagctggca catggccaat gcatatcgat ctatacattg aatcaatatt 3420 ggccattagc catattattc attggttata tagcataaat caatattggc tattggccat 3480 tgcatacgtt gtatccatat cataatatgt acatttatat tggctcatgt ccaacattac 3540 cgccatgttg acattgatta ttgactagtt attaatagta atcaattacg gggtcattag 3600 ttcatagccc atatatggag ttccgcgtta cataacttac ggtaaatggc ccgcctggct 3660 gaccgcccaa cgacccccgc ccattgacgt caataatgac gtatgttccc atagtaacgc 3720 caatagggac tttccattga cgtcaatggg tggagtattt acggtaaact gcccacttgg 3780 cagtacatca agtgtatcat atgccaagta cgccccctat tgacgtcaat gacggtaaat 3840 ggcccgcctg gcattatgcc cagtacatga ccttatggga ctttcctact tggcagtaca 3900 tctacgtatt agtcatcgct attaccatgg tgatgcggtt ttggcagtac atcaatgggc 3960 gtggatagcg gtttgactca cggggatttc caagtctcca ccccattgac gtcaatggga 4020 gtttgttttg gcaccaaaat caacgggact ttccaaaatg tcgtaacaac tccgccccat 4080 tgacgcaaat gggcggtagg cgtgtacggt gggaggtcta tataagcaga gctcgtttag 4140 tgaaccgtca gatcgcctgg agacgccatc cacgctgttt tgacctccat agaagacacc 4200 gggaccgatc cagcctccgc ggccgggaac ggtgcattgg aacgcggatt ccccgtgcca 4260 agagtgacgt aagtaccgcc tatagagtct ataggcccac ccccttggct tcttatgcat 4320 gctatactgt ttttggcttg gggtctatac acccccgctt cctcatgtta taggtgatgg 4380 tatagcttag cctataggtg tgggttattg accattattg accactcccc tattggtgac 4440 gatactttcc attactaatc cataacatgg ctctttgcac aactctcttt attggctata 4500 tgccaataca ctgtccttca gagactgaca cggactctgt atttttacag gatggggtct 4560 catttattat ttacaaattc acatatacaa caccaccgtc cccagtgccc gcagttttta 4620 ttaaacataa cgtgggatct ccagcgaatc tcgggtacgt gttccggaca tggggctctt 
4680 ctccggtagc ggsggagytt ctacatccgr rsccttgytc ccatgcctcc asgrmttcat 4740 gktcgytcgg cagctccttg ctcctaamca gtggaggcca gacttaggca cagcacgatg 4800 cccaccacca ccagtgtgcc gcacaaggcc gtggcggtag ggtatgtgtc tgaaaatgag 4860 ctcggggagc gggcttgcac cgctggacgc atttggaaga cttaaggcag cggcagaaga 4920 agatgcaggc agctgagttg ttgtgttctg ataagagtca gaggtaactc ccgttgcggt 4980 gctgttaacg gtggagggca gtgtagtctg agcagtactc gttgctgccg cgcgcgccac 5040 cagacataat agctgacaga ctaacagact gttcctttcc atgggtcttt tctgcagtca 5100 ccgtccttga cacgaagctt cccgggatag gtacctcgcg agatccctcg aggaggaatt 5160 ctgacactat gaagtgcctt ttgtacttag cctttttatt cattggggtg aattgcaagt 5220 tcaccatagt ttttccacac aaccaaaaag gaaactggaa aaatgttcct tctaattacc 5280 attattgccc gtcaagctca gatttaaatt ggcataatga cttaataggc acagccttac 5340 aagtcaaaat gcccaagagt cacaaggcta ttcaagcaga cggttggatg tgtcatgctt 5400 ccaaatgggt cactacttgt gatttccgct ggtatggacc gaagtatata acacattcca 5460 tccgatcctt cactccatct gtagaacaat gcaaggaaag cattgaacaa acgaaacaag 5520 gaacttggct gaatccaggc ttccctcctc aaagttgtgg atatgcaact gtgacggatg 5580 ccgaagcagt gattgtccag gtgactcctc accatgtgct ggttgatgaa tacacaggag 5640 aatgggttga ttcacagttc atcaacggaa aatgcagcaa ttacatatgc cccactgtcc 5700 ataactctac aacctggcat tctgactata aggtcaaagg gctatgtgat tctaacctca 5760 tttccatgga catcaccttc ttctcagagg acggagagct atcatccctg ggaaaggagg 5820 gcacagggtt cagaagtaac tactttgctt atgaaactgg aggcaaggcc tgcaaaatgc 5880 aatactgcaa gcattgggga gtcagactcc catcaggtgt ctggttcgag atggctgata 5940 aggatctctt tgctgcagcc agattccctg aatgcccaga agggtcaagt atctctgctc 6000 catctcagac ctcagtggat gtaagtctaa ttcaggacgt tgagaggatc ttggattatt 6060 ccctctgcca agaaacctgg agcaaaatca gagcgggtct tccaatctct ccagtggatc 6120 tcagctatct tgctcctaaa aacccaggaa ccggtcctgc tttcaccata atcaatgggg 6180 gcctaaaata ctttgagacc agatacatca gagtcgatat tgctgctcca atcctctcaa 6240 gaatggtcgg aatgatcagt ggaactacca cagaaaggga actgtgggat gactgggcac 6300 catatgaaga cgtggaaatt ggacccaatg gagttctgag gaccagttca ggatataagt 6360 
ttcctttata catgattgga catggtatgt tggactccga tcttcatctt agctcaaagg 6420 ctcaggtgtt cgaacatcct cacattcaag acgctgcttc gcaacttcct gatgatgaga 6480 gtttattttt tggtgatact gggctatcca aaaatccaat cgagcttgta gaaggttggt 6540 tcagtagttg gaaaagctct attgcctctt ttttctttat catagggtta atcattggac 6600 tattcttggt tctccgagtt ggtatccatc tttgcattaa attaaagcac accaagaaaa 6660 gacagattta tacagacata gagatgaacc gacttggaaa gtaactcaaa tcctgcacaa 6720 cagattcttc atgtttggac caaatcaact tgtgatacca tgctcaaaga ggcctcaatt 6780 atatttgagt ttttaatttt tatgaaaaaa aaaaaaaaaa acggaattcc tcgagggatc 6840 tagag 6845 7 1375 DNA Artificial Sequence Description of Artificial Sequence RARbeta2 PCR product 7 actgccgcgg gccaccatgt ttgactgtat ggatgttctg tcagtgagtc ccgggcagat 60 cctggatttc tacaccgcga gcccttcctc ctgcatgctg caggaaaagg ctctcaaagc 120 ctgcctcagt ggattcaccc aggccgaatg gcagcaccgg catactgctc aatccatcga 180 gacacagagt accagctctg aggagctcgt cccgagccca ccatctccac ttcctcctcc 240 tcgggtgtac aagccctgct tcgtttgcca ggacaagtca tcgggctacc actatggcgt 300 cagtgcctgc gaggggtgca agggcttttt ccgcagaagt attcagaaga acatgatcta 360 cacttgccat cgagataaga actgcgtcat taacaaggtc actaggaacc gatgccagta 420 ctgccgcctg cagaagtgct ttgaagtggg catgtccaaa gagtctgtta ggaatgacag 480 gaacaagaaa aagaaggagc cttcaaagca ggaatgcaca gagagctatg agatgacagc 540 ggagctagac gacctcactg agaagatccg gaaagcccac caggaaacct ttccctcact 600 ctgccagctg ggtaaataca ccacgaattc cagcgctgac caccgggtcc gattggactt 660 gggcctctgg gacaaattca gtgagctggc caccaagtgc attattaaga tcgtggagtt 720 cgccaagcgt ctgccgggct tcacaggtct gaccatcgca gaccagatca ccctgctcaa 780 agccgcctgc ttggatatct tgattctcag aatttgtacc aggtataccc cagagcaaga 840 caccatgact ttctctgatg gccttacact aaatcgaact cagatgcaca atgctggctt 900 cggtcctctg actgaccttg tgttcacctt tgccaaccag ctcctgcctt tggaaatgga 960 tgacacagaa acaggccttc tcagtgccat ctgtttaatc tgtggagacc gccaggacct 1020 tgaggaacca acaaaagtag acaagctcca agaaccactg ctggaagcac taaagattta 1080 cattagaaaa cgacgaccca gcaagcctca catgtttcca aagatcttaa tgaaaatcac 1140 
agatctccgc agcatcagcg cgaaaggtgc cgaacgtgta attaccttga aaatggaaat 1200 tcctggatca atgccacctc tcattcagga aatgctggag aattctgaag gacatgaacc 1260 cttgacccca agttcaagtg ggaatatagc agagcacagt cccagcgtgt cccccagctc 1320 agtggagaac agtggagtca gtcagtcacc actgctgcag tgagcggccg ccagt 1375 8 1399 DNA Artificial Sequence Description of Artificial Sequence FLAG RARbeta2 PCR product 8 actgccgcgg gccaccatgg actacaagga cgacgatgac aagtttgact gtatggatgt 60 tctgtcagtg agtcccgggc agatcctgga tttctacacc gcgagccctt cctcctgcat 120 gctgcaggaa aaggctctca aagcctgcct cagtggattc acccaggccg aatggcagca 180 ccggcatact gctcaatcca tcgagacaca gagtaccagc tctgaggagc tcgtcccgag 240 cccaccatct ccacttcctc ctcctcgggt gtacaagccc tgcttcgttt gccaggacaa 300 gtcatcgggc taccactatg gcgtcagtgc ctgcgagggg tgcaagggct ttttccgcag 360 aagtattcag aagaacatga tctacacttg ccatcgagat aagaactgcg tcattaacaa 420 ggtcactagg aaccgatgcc agtactgccg cctgcagaag tgctttgaag tgggcatgtc 480 caaagagtct gttaggaatg acaggaacaa gaaaaagaag gagccttcaa agcaggaatg 540 cacagagagc tatgagatga cagcggagct agacgacctc actgagaaga tccggaaagc 600 ccaccaggaa acctttccct cactctgcca gctgggtaaa tacaccacga attccagcgc 660 tgaccaccgg gtccgattgg acttgggcct ctgggacaaa ttcagtgagc tggccaccaa 720 gtgcattatt aagatcgtgg agttcgccaa gcgtctgccg ggcttcacag gtctgaccat 780 cgcagaccag atcaccctgc tcaaagccgc ctgcttggat atcttgattc tcagaatttg 840 taccaggtat accccagagc aagacaccat gactttctct gatggcctta cactaaatcg 900 aactcagatg cacaatgctg gcttcggtcc tctgactgac cttgtgttca cctttgccaa 960 ccagctcctg cctttggaaa tggatgacac agaaacaggc cttctcagtg ccatctgttt 1020 aatctgtgga gaccgccagg accttgagga accaacaaaa gtagacaagc tccaagaacc 1080 actgctggaa gcactaaaga tttacattag aaaacgacga cccagcaagc ctcacatgtt 1140 tccaaagatc ttaatgaaaa tcacagatct ccgcagcatc agcgcgaaag gtgccgaacg 1200 tgtaattacc ttgaaaatgg aaattcctgg atcaatgcca cctctcattc aggaaatgct 1260 ggagaattct gaaggacatg aacccttgac cccaagttca agtgggaata tagcagagca 1320 cagtcccagc gtgtccccca gctcagtgga gaacagtgga gtcagtcagt caccactgct 1380 gcagtgagcg 
gccgccagt 1399 9 9127 DNA Artificial Sequence Description of Artificial Sequence pONY- RARbeta2 vector genome plasmid 9 agatcttgaa taataaaatg tgtgtttgtc cgaaatacgc gttttgagat ttctgtcgcc 60 gactaaattc atgtcgcgcg atagtggtgt ttatcgccga tagagatggc gatattggaa 120 aaattgatat ttgaaaatat ggcatattga aaatgtcgcc gatgtgagtt tctgtgtaac 180 tgatatcgcc atttttccaa aagtgatttt tgggcatacg cgatatctgg cgatagcgct 240 tatatcgttt acgggggatg gcgatagacg actttggtga cttgggcgat tctgtgtgtc 300 gcaaatatcg cagtttcgat ataggtgaca gacgatatga ggctatatcg ccgatagagg 360 cgacatcaag ctggcacatg gccaatgcat atcgatctat acattgaatc aatattggcc 420 attagccata ttattcattg gttatatagc ataaatcaat attggctatt ggccattgca 480 tacgttgtat ccatatcgta atatgtacat ttatattggc tcatgtccaa cattaccgcc 540 atgttgacat tgattattga ctagttatta atagtaatca attacggggt cattagttca 600 tagcccatat atggagttcc gcgttacata acttacggta aatggcccgc ctggctgacc 660 gcccaacgac ccccgcccat tgacgtcaat aatgacgtat gttcccatag taacgccaat 720 agggactttc cattgacgtc aatgggtgga gtatttacgg taaactgccc acttggcagt 780 acatcaagtg tatcatatgc caagtccgcc ccctattgac gtcaatgacg gtaaatggcc 840 cgcctggcat tatgcccagt acatgacctt acgggacttt cctacttggc agtacatcta 900 cgtattagtc atcgctatta ccatggtgat gcggttttgg cagtacacca atgggcgtgg 960 atagcggttt gactcacggg gatttccaag tctccacccc attgacgtca atgggagttt 1020 gttttggcac caaaatcaac gggactttcc aaaatgtcgt aacaactgcg atcgcccgcc 1080 ccgttgacgc aaatgggcgg taggcgtgta cggtgggagg tctatataag cagagctcgt 1140 ttagtgaacc gggcactcag attctgcggt ctgagtccct tctctgctgg gctgaaaagg 1200 cctttgtaat aaatataatt ctctactcag tccctgtctc tagtttgtct gttcgagatc 1260 ctacagttgg cgcccgaaca gggacctgag aggggcgcag accctacctg ttgaacctgg 1320 ctgatcgtag gatccccggg acagcagagg agaacttaca gaagtcttct ggaggtgttc 1380 ctggccagaa cacaggagga caggtaagat tgggagaccc tttgacattg gagcaaggcg 1440 ctcaagaagt tagagaaggt gacggtacaa gggtctcaga aattaactac tggtaactgt 1500 aattgggcgc taagtctagt agacttattt catgatacca actttgtaaa agaaaaggac 1560 tggcagctga gggatgtcat tccattgctg gaagatgtaa 
ctcagacgct gtcaggacaa 1620 gaaagagagg cctttgaaag aacatggtgg gcaatttctg ctgtaaagat gggcctccag 1680 attaataatg tagtagatgg aaaggcatca ttccagctcc taagagcgaa atatgaaaag 1740 aagactgcta ataaaaagca gtctgagccc tctgaagaat atctctagag tcgacgctct 1800 cattacttgt aacaaaggga gggaaagtat gggaggacag acaccatggg aagtatttat 1860 cactaatcaa gcacaagtaa tacatgagaa acttttacta cagcaagcac aatcctccaa 1920 aaaattttgt ttttacaaaa tccctggtga acatggtcga ctctagaact agtggatccc 1980 ccgggctgca ggagtgggga ggcacgatgg ccgctttggt cgaggcggat ccggccatta 2040 gccatattat tcattggtta tatagcataa atcaatattg gctattggcc attgcatacg 2100 ttgtatccat atcataatat gtacatttat attggctcat gtccaacatt accgccatgt 2160 tgacattgat tattgactag ttattaatag taatcaatta cggggtcatt agttcatagc 2220 ccatatatgg agttccgcgt tacataactt acggtaaatg gcccgcctgg ctgaccgccc 2280 aacgaccccc gcccattgac gtcaataatg acgtatgttc ccatagtaac gccaataggg 2340 actttccatt gacgtcaatg ggtggagtat ttacggtaaa ctgcccactt ggcagtacat 2400 caagtgtatc atatgccaag tacgccccct attgacgtca atgacggtaa atggcccgcc 2460 tggcattatg cccagtacat gaccttatgg gactttccta cttggcagta catctacgta 2520 ttagtcatcg ctattaccat ggtgatgcgg ttttggcagt acatcaatgg gcgtggatag 2580 cggtttgact cacggggatt tccaagtctc caccccattg acgtcaatgg gagtttgttt 2640 tggcaccaaa atcaacggga ctttccaaaa tgtcgtaaca actccgcccc attgacgcaa 2700 atgggcggta ggcatgtacg gtgggaggtc tatataagca gagctcgttt agtgaaccgt 2760 cagatcgcct ggagacgcca tccacgctgt tttgacctcc atagaagaca ccgggaccga 2820 tccagcctcc gcgggccacc atgtttgact gtatggatgt tctgtcagtg agtcccgggc 2880 agatcctgga tttctacacc gcgagccctt cctcctgcat gctgcaggaa aaggctctca 2940 aagcctgcct cagtggattc acccaggccg aatggcagca ccggcatact gctcaatcca 3000 tcgagacaca gagtaccagc tctgaggagc tcgtcccgag cccaccatct ccacttcctc 3060 ctcctcgggt gtacaagccc tgcttcgttt gccaggacaa gtcatcgggc taccactatg 3120 gcgtcagtgc ctgcgagggg tgcaagggct ttttccgcag aagtattcag aagaacatga 3180 tctacacttg ccatcgagat aagaactgcg tcattaacaa ggtcactagg aaccgatgcc 3240 agtactgccg cctgcagaag tgctttgaag tgggcatgtc caaagagtct 
gttaggaatg 3300 acaggaacaa gaaaaagaag gagccttcaa agcaggaatg cacagagagc tatgagatga 3360 cagcggagct agacgacctc actgagaaga tccggaaagc ccaccaggaa acctttccct 3420 cactctgcca gctgggtaaa tacaccacga attccagcgc tgaccaccgg gtccgattgg 3480 acttgggcct ctgggacaaa ttcagtgagc tggccaccaa gtgcattatt aagatcgtgg 3540 agttcgccaa gcgtctgccg ggcttcacag gtctgaccat cgcagaccag atcaccctgc 3600 tcaaagccgc ctgcttggat atcttgattc tcagaatttg taccaggtat accccagagc 3660 aagacaccat gactttctct gatggcctta cactaaatcg aactcagatg cacaatgctg 3720 gcttcggtcc tctgactgac cttgtgttca cctttgccaa ccagctcctg cctttggaaa 3780 tggatgacac agaaacaggc cttctcagtg ccatctgttt aatctgtgga gaccgccagg 3840 accttgagga accaacaaaa gtagacaagc tccaagaacc actgctggaa gcactaaaga 3900 tttacattag aaaacgacga cccagcaagc ctcacatgtt tccaaagatc ttaatgaaaa 3960 tcacagatct ccgcagcatc agcgcgaaag gtgccgaacg tgtaattacc ttgaaaatgg 4020 aaattcctgg atcaatgcca cctctcattc aggaaatgct ggagaattct gaaggacatg 4080 aacccttgac cccaagttca agtgggaata tagcagagca cagtcccagc gtgtccccca 4140 gctcagtgga gaacagtgga gtcagtcagt caccactgct gcagtgagcg gccgcgactc 4200 tagagtcgac ctcgaggggg ggcccggacc tactagggtg ctgtggaagg gtgatggtgc 4260 agtagtagtt aatgatgaag gaaagggaat aattgctgta ccattaacca ggactaagtt 4320 actaataaaa ccaaattgag tattgttgca ggaagcaaga cccaactacc attgtcagct 4380 gtgtttcctg acctcaatat ttgttataag gtttgatatg aatcccaggg ggaatctcaa 4440 cccctattac ccaacagtca gaaaaatcta agtgtgagga gaacacaatg tttcaacctt 4500 attgttataa taatgacagt aagaacagca tggcagaatc gaaggaagca agagaccaag 4560 aatgaacctg aaagaagaat ctaaagaaga aaaaagaaga aatgactggt ggaaaatagg 4620 tatgtttctg ttatgcttag caggaactac tggaggaata ctttggtggt atgaaggact 4680 cccacagcaa cattatatag ggttggtggc gataggggga agattaaacg gatctggcca 4740 atcaaatgct atagaatgct ggggttcctt cccggggtgt agaccatttc aaaattactt 4800 cagttatgag accaatagaa gcatgcatat ggataataat actgctacat tattagaagc 4860 tttaaccaat ataactgctc tataaataac aaaacagaat tagaaacatg gaagttagta 4920 aagacttctg gcataactcc tttacctatt tcttctgaag ctaacactgg actaattaga 
4980 cataagagag attttggtat aagtgcaata gtggcagcta ttgtagccgc tactgctatt 5040 gctgctagcg ctactatgtc ttatgttgct ctaactgagg ttaacaaaat aatggaagta 5100 caaaatcata cttttgaggt agaaaatagt actctaaatg gtatggattt aatagaacga 5160 caaataaaga tattatatgc tatgattctt caaacacatg cagatgttca actgttaaag 5220 gaaagacaac aggtagagga gacatttaat ttaattggat gtatagaaag aacacatgta 5280 ttttgtcata ctggtcatcc ctggaatatg tcatggggac atttaaatga gtcaacacaa 5340 tgggatgact gggtaagcaa aatggaagat ttaaatcaag agatactaac tacacttcat 5400 ggagccagga acaatttggc acaatccatg ataacattca atacaccaga tagtatagct 5460 caatttggaa aagacctttg gagtcatatt ggaaattgga ttcctggatt gggagcttcc 5520 attataaaat atatagtgat gtttttgctt atttatttgt tactaacctc ttcgcctaag 5580 atcctcaggg ccctctggaa ggtgaccagt ggtgcagggt cctccggcag tcgttacctg 5640 aagaaaaaat tccatcacaa acatgcatcg cgagaagaca cctgggacca ggcccaacac 5700 aacatacacc tagcaggcgt gaccggtgga tcaggggaca aatactacaa gcagaagtac 5760 tccaggaacg actggaatgg agaatcagag gagtacaaca ggcggccaaa gagctgggtg 5820 aagtcaatcg aggcatttgg agagagctat atttccgaga agaccaaagg ggagatttct 5880 cagcctgggg cggctatcaa cgagcacaag aacggctctg gggggaacaa tcctcaccaa 5940 gggtccttag acctggagat tcgaagcgaa ggaggaaaca tttatgactg ttgcattaaa 6000 gcccaagaag gaactctcgc tatcccttgc tgtggatttc ccttatggct attttgggga 6060 ctagtaatta tagtaggacg catagcaggc tatggattac gtggactcgc tgttataata 6120 aggatttgta ttagaggctt aaatttgata tttgaaataa tcagaaaaat gcttgattat 6180 attggaagag ctttaaatcc tggcacatct catgtatcaa tgcctcagta tgtttagaaa 6240 aacaaggggg gaactgtggg gtttttatga ggggttttat aaatgattat aagagtaaaa 6300 agaaagttgc tgatgctctc ataaccttgt ataacccaaa ggactagctc atgttgctag 6360 gcaactaaac cgcaataacc gcatttgtga cgcgagttcc ccattggtga cgcgttaact 6420 tcctgttttt acagtatata agtgcttgta ttctgacaat tgggcactca gattctgcgg 6480 tctgagtccc ttctctgctg ggctgaaaag gcctttgtaa taaatataat tctctactca 6540 gtccctgtct ctagtttgtc tgttcgagat cctacagagc tcatgccttg gcgtaatcat 6600 ggtcatagct gtttcctgtg tgaaattgtt atccgctcac aattccacac aacatacgag 6660 
ccggaagcat aaagtgtaaa gcctggggtg cctaatgagt gagctaactc acattaattg 6720 cgttgcgctc actgcccgct ttccagtcgg gaaacctgtc gtgccagctg cattaatgaa 6780 tcggccaacg cgcggggaga ggcggtttgc gtattgggcg ctcttccgct tcctcgctca 6840 ctgactcgct gcgctcggtc gttcggctgc ggcgagcggt atcagctcac tcaaaggcgg 6900 taatacggtt atccacagaa tcaggggata acgcaggaaa gaacatgtga gcaaaaggcc 6960 agcaaaaggc caggaaccgt aaaaaggccg cgttgctggc gtttttccat aggctccgcc 7020 cccctgacga gcatcacaaa aatcgacgct caagtcagag gtggcgaaac ccgacaggac 7080 tataaagata ccaggcgttt ccccctggaa gctccctcgt gcgctctcct gttccgaccc 7140 tgccgcttac cggatacctg tccgcctttc tcccttcggg aagcgtggcg ctttctcata 7200 gctcacgctg taggtatctc agttcggtgt aggtcgttcg ctccaagctg ggctgtgtgc 7260 acgaaccccc cgttcagccc gaccgctgcg ccttatccgg taactatcgt cttgagtcca 7320 acccggtaag acacgactta tcgccactgg cagcagccac tggtaacagg attagcagag 7380 cgaggtatgt aggcggtgct acagagttct tgaagtggtg gcctaactac ggctacacta 7440 gaaggacagt atttggtatc tgcgctctgc tgaagccagt taccttcgga aaaagagttg 7500 gtagctcttg atccggcaaa caaaccaccg ctggtagcgg tggttttttt gtttgcaagc 7560 agcagattac gcgcagaaaa aaaggatctc aagaagatcc tttgatcttt tctacggggt 7620 ctgacgctca gtggaacgaa aactcacgtt aagggatttt ggtcatgaga ttatcaaaaa 7680 ggatcttcac ctagatcctt ttaaattaaa aatgaagttt taaatcaatc taaagtatat 7740 atgagtaaac ttggtctgac agttaccaat gcttaatcag tgaggcacct atctcagcga 7800 tctgtctatt tcgttcatcc atagttgcct gactccccgt cgtgtagata actacgatac 7860 gggagggctt accatctggc cccagtgctg caatgatacc gcgagaccca cgctcaccgg 7920 ctccagattt atcagcaata aaccagccag ccggaagggc cgagcgcaga agtggtcctg 7980 caactttatc cgcctccatc cagtctatta attgttgccg ggaagctaga gtaagtagtt 8040 cgccagttaa tagtttgcgc aacgttgttg ccattgctac aggcatcgtg gtgtcacgct 8100 cgtcgtttgg tatggcttca ttcagctccg gttcccaacg atcaaggcga gttacatgat 8160 cccccatgtt gtgcaaaaaa gcggttagct ccttcggtcc tccgatcgtt gtcagaagta 8220 agttggccgc agtgttatca ctcatggtta tggcagcact gcataattct cttactgtca 8280 tgccatccgt aagatgcttt tctgtgactg gtgagtactc aaccaagtca ttctgagaat 8340 agtgtatgcg 
gcgaccgagt tgctcttgcc cggcgtcaat acgggataat accgcgccac 8400 atagcagaac tttaaaagtg ctcatcattg gaaaacgttc ttcggggcga aaactctcaa 8460 ggatcttacc gctgttgaga tccagttcga tgtaacccac tcgtgcaccc aactgatctt 8520 cagcatcttt tactttcacc agcgtttctg ggtgagcaaa aacaggaagg caaaatgccg 8580 caaaaaaggg aataagggcg acacggaaat gttgaatact catactcttc ctttttcaat 8640 attattgaag catttatcag ggttattgtc tcatgagcgg atacatattt gaatgtattt 8700 agaaaaataa acaaataggg gttccgcgca catttccccg aaaagtgcca cctaaattgt 8760 aagcgttaat attttgttaa aattcgcgtt aaatttttgt taaatcagct cattttttaa 8820 ccaataggcc gaaatcggca aaatccctta taaatcaaaa gaatagaccg agatagggtt 8880 gagtgttgtt ccagtttgga acaagagtcc actattaaag aacgtggact ccaacgtcaa 8940 agggcgaaaa accgtctatc agggcgatgg cccactacgt gaaccatcac cctaatcaag 9000 ttttttgggg tcgaggtgcc gtaaagcact aaatcggaac cctaaaggga gcccccgatt 9060 tagagcttga cggggaaagc caacctggct tatcgaaatt aatacgactc actataggga 9120 gaccggc 9127 10 9151 DNA Artificial Sequence Description of Artificial Sequence pONY-FLAG-RARbeta2 vector genome plasmid 10 agatcttgaa taataaaatg tgtgtttgtc cgaaatacgc gttttgagat ttctgtcgcc 60 gactaaattc atgtcgcgcg atagtggtgt ttatcgccga tagagatggc gatattggaa 120 aaattgatat ttgaaaatat ggcatattga aaatgtcgcc gatgtgagtt tctgtgtaac 180 tgatatcgcc atttttccaa aagtgatttt tgggcatacg cgatatctgg cgatagcgct 240 tatatcgttt acgggggatg gcgatagacg actttggtga cttgggcgat tctgtgtgtc 300 gcaaatatcg cagtttcgat ataggtgaca gacgatatga ggctatatcg ccgatagagg 360 cgacatcaag ctggcacatg gccaatgcat atcgatctat acattgaatc aatattggcc 420 attagccata ttattcattg gttatatagc ataaatcaat attggctatt ggccattgca 480 tacgttgtat ccatatcgta atatgtacat ttatattggc tcatgtccaa cattaccgcc 540 atgttgacat tgattattga ctagttatta atagtaatca attacggggt cattagttca 600 tagcccatat atggagttcc gcgttacata acttacggta aatggcccgc ctggctgacc 660 gcccaacgac ccccgcccat tgacgtcaat aatgacgtat gttcccatag taacgccaat 720 agggactttc cattgacgtc aatgggtgga gtatttacgg taaactgccc acttggcagt 780 acatcaagtg tatcatatgc caagtccgcc ccctattgac gtcaatgacg 
gtaaatggcc 840 cgcctggcat tatgcccagt acatgacctt acgggacttt cctacttggc agtacatcta 900 cgtattagtc atcgctatta ccatggtgat gcggttttgg cagtacacca atgggcgtgg 960 atagcggttt gactcacggg gatttccaag tctccacccc attgacgtca atgggagttt 1020 gttttggcac caaaatcaac gggactttcc aaaatgtcgt aacaactgcg atcgcccgcc 1080 ccgttgacgc aaatgggcgg taggcgtgta cggtgggagg tctatataag cagagctcgt 1140 ttagtgaacc gggcactcag attctgcggt ctgagtccct tctctgctgg gctgaaaagg 1200 cctttgtaat aaatataatt ctctactcag tccctgtctc tagtttgtct gttcgagatc 1260 ctacagttgg cgcccgaaca gggacctgag aggggcgcag accctacctg ttgaacctgg 1320 ctgatcgtag gatccccggg acagcagagg agaacttaca gaagtcttct ggaggtgttc 1380 ctggccagaa cacaggagga caggtaagat tgggagaccc tttgacattg gagcaaggcg 1440 ctcaagaagt tagagaaggt gacggtacaa gggtctcaga aattaactac tggtaactgt 1500 aattgggcgc taagtctagt agacttattt catgatacca actttgtaaa agaaaaggac 1560 tggcagctga gggatgtcat tccattgctg gaagatgtaa ctcagacgct gtcaggacaa 1620 gaaagagagg cctttgaaag aacatggtgg gcaatttctg ctgtaaagat gggcctccag 1680 attaataatg tagtagatgg aaaggcatca ttccagctcc taagagcgaa atatgaaaag 1740 aagactgcta ataaaaagca gtctgagccc tctgaagaat atctctagag tcgacgctct 1800 cattacttgt aacaaaggga gggaaagtat gggaggacag acaccatggg aagtatttat 1860 cactaatcaa gcacaagtaa tacatgagaa acttttacta cagcaagcac aatcctccaa 1920 aaaattttgt ttttacaaaa tccctggtga acatggtcga ctctagaact agtggatccc 1980 ccgggctgca ggagtgggga ggcacgatgg ccgctttggt cgaggcggat ccggccatta 2040 gccatattat tcattggtta tatagcataa atcaatattg gctattggcc attgcatacg 2100 ttgtatccat atcataatat gtacatttat attggctcat gtccaacatt accgccatgt 2160 tgacattgat tattgactag ttattaatag taatcaatta cggggtcatt agttcatagc 2220 ccatatatgg agttccgcgt tacataactt acggtaaatg gcccgcctgg ctgaccgccc 2280 aacgaccccc gcccattgac gtcaataatg acgtatgttc ccatagtaac gccaataggg 2340 actttccatt gacgtcaatg ggtggagtat ttacggtaaa ctgcccactt ggcagtacat 2400 caagtgtatc atatgccaag tacgccccct attgacgtca atgacggtaa atggcccgcc 2460 tggcattatg cccagtacat gaccttatgg gactttccta cttggcagta catctacgta 
2520 ttagtcatcg ctattaccat ggtgatgcgg ttttggcagt acatcaatgg gcgtggatag 2580 cggtttgact cacggggatt tccaagtctc caccccattg acgtcaatgg gagtttgttt 2640 tggcaccaaa atcaacggga ctttccaaaa tgtcgtaaca actccgcccc attgacgcaa 2700 atgggcggta ggcatgtacg gtgggaggtc tatataagca gagctcgttt agtgaaccgt 2760 cagatcgcct ggagacgcca tccacgctgt tttgacctcc atagaagaca ccgggaccga 2820 tccagcctcc gcgggccacc atggactaca aggacgacga tgacaagttt gactgtatgg 2880 atgttctgtc agtgagtccc gggcagatcc tggatttcta caccgcgagc ccttcctcct 2940 gcatgctgca ggaaaaggct ctcaaagcct gcctcagtgg attcacccag gccgaatggc 3000 agcaccggca tactgctcaa tccatcgaga cacagagtac cagctctgag gagctcgtcc 3060 cgagcccacc atctccactt cctcctcctc gggtgtacaa gccctgcttc gtttgccagg 3120 acaagtcatc gggctaccac tatggcgtca gtgcctgcga ggggtgcaag ggctttttcc 3180 gcagaagtat tcagaagaac atgatctaca cttgccatcg agataagaac tgcgtcatta 3240 acaaggtcac taggaaccga tgccagtact gccgcctgca gaagtgcttt gaagtgggca 3300 tgtccaaaga gtctgttagg aatgacagga acaagaaaaa gaaggagcct tcaaagcagg 3360 aatgcacaga gagctatgag atgacagcgg agctagacga cctcactgag aagatccgga 3420 aagcccacca ggaaaccttt ccctcactct gccagctggg taaatacacc acgaattcca 3480 gcgctgacca ccgggtccga ttggacttgg gcctctggga caaattcagt gagctggcca 3540 ccaagtgcat tattaagatc gtggagttcg ccaagcgtct gccgggcttc acaggtctga 3600 ccatcgcaga ccagatcacc ctgctcaaag ccgcctgctt ggatatcttg attctcagaa 3660 tttgtaccag gtatacccca gagcaagaca ccatgacttt ctctgatggc cttacactaa 3720 atcgaactca gatgcacaat gctggcttcg gtcctctgac tgaccttgtg ttcacctttg 3780 ccaaccagct cctgcctttg gaaatggatg acacagaaac aggccttctc agtgccatct 3840 gtttaatctg tggagaccgc caggaccttg aggaaccaac aaaagtagac aagctccaag 3900 aaccactgct ggaagcacta aagatttaca ttagaaaacg acgacccagc aagcctcaca 3960 tgtttccaaa gatcttaatg aaaatcacag atctccgcag catcagcgcg aaaggtgccg 4020 aacgtgtaat taccttgaaa atggaaattc ctggatcaat gccacctctc attcaggaaa 4080 tgctggagaa ttctgaagga catgaaccct tgaccccaag ttcaagtggg aatatagcag 4140 agcacagtcc cagcgtgtcc cccagctcag tggagaacag tggagtcagt cagtcaccac 4200 
tgctgcagtg agcggccgcg actctagagt cgacctcgag ggggggcccg gacctactag 4260 ggtgctgtgg aagggtgatg gtgcagtagt agttaatgat gaaggaaagg gaataattgc 4320 tgtaccatta accaggacta agttactaat aaaaccaaat tgagtattgt tgcaggaagc 4380 aagacccaac taccattgtc agctgtgttt cctgacctca atatttgtta taaggtttga 4440 tatgaatccc agggggaatc tcaaccccta ttacccaaca gtcagaaaaa tctaagtgtg 4500 aggagaacac aatgtttcaa ccttattgtt ataataatga cagtaagaac agcatggcag 4560 aatcgaagga agcaagagac caagaatgaa cctgaaagaa gaatctaaag aagaaaaaag 4620 aagaaatgac tggtggaaaa taggtatgtt tctgttatgc ttagcaggaa ctactggagg 4680 aatactttgg tggtatgaag gactcccaca gcaacattat atagggttgg tggcgatagg 4740 gggaagatta aacggatctg gccaatcaaa tgctatagaa tgctggggtt ccttcccggg 4800 gtgtagacca tttcaaaatt acttcagtta tgagaccaat agaagcatgc atatggataa 4860 taatactgct acattattag aagctttaac caatataact gctctataaa taacaaaaca 4920 gaattagaaa catggaagtt agtaaagact tctggcataa ctcctttacc tatttcttct 4980 gaagctaaca ctggactaat tagacataag agagattttg gtataagtgc aatagtggca 5040 gctattgtag ccgctactgc tattgctgct agcgctacta tgtcttatgt tgctctaact 5100 gaggttaaca aaataatgga agtacaaaat catacttttg aggtagaaaa tagtactcta 5160 aatggtatgg atttaataga acgacaaata aagatattat atgctatgat tcttcaaaca 5220 catgcagatg ttcaactgtt aaaggaaaga caacaggtag aggagacatt taatttaatt 5280 ggatgtatag aaagaacaca tgtattttgt catactggtc atccctggaa tatgtcatgg 5340 ggacatttaa atgagtcaac acaatgggat gactgggtaa gcaaaatgga agatttaaat 5400 caagagatac taactacact tcatggagcc aggaacaatt tggcacaatc catgataaca 5460 ttcaatacac cagatagtat agctcaattt ggaaaagacc tttggagtca tattggaaat 5520 tggattcctg gattgggagc ttccattata aaatatatag tgatgttttt gcttatttat 5580 ttgttactaa cctcttcgcc taagatcctc agggccctct ggaaggtgac cagtggtgca 5640 gggtcctccg gcagtcgtta cctgaagaaa aaattccatc acaaacatgc atcgcgagaa 5700 gacacctggg accaggccca acacaacata cacctagcag gcgtgaccgg tggatcaggg 5760 gacaaatact acaagcagaa gtactccagg aacgactgga atggagaatc agaggagtac 5820 aacaggcggc caaagagctg ggtgaagtca atcgaggcat ttggagagag ctatatttcc 5880 gagaagacca 
aaggggagat ttctcagcct ggggcggcta tcaacgagca caagaacggc 5940 tctgggggga acaatcctca ccaagggtcc ttagacctgg agattcgaag cgaaggagga 6000 aacatttatg actgttgcat taaagcccaa gaaggaactc tcgctatccc ttgctgtgga 6060 tttcccttat ggctattttg gggactagta attatagtag gacgcatagc aggctatgga 6120 ttacgtggac tcgctgttat aataaggatt tgtattagag gcttaaattt gatatttgaa 6180 ataatcagaa aaatgcttga ttatattgga agagctttaa atcctggcac atctcatgta 6240 tcaatgcctc agtatgttta gaaaaacaag gggggaactg tggggttttt atgaggggtt 6300 ttataaatga ttataagagt aaaaagaaag ttgctgatgc tctcataacc ttgtataacc 6360 caaaggacta gctcatgttg ctaggcaact aaaccgcaat aaccgcattt gtgacgcgag 6420 ttccccattg gtgacgcgtt aacttcctgt ttttacagta tataagtgct tgtattctga 6480 caattgggca ctcagattct gcggtctgag tcccttctct gctgggctga aaaggccttt 6540 gtaataaata taattctcta ctcagtccct gtctctagtt tgtctgttcg agatcctaca 6600 gagctcatgc cttggcgtaa tcatggtcat agctgtttcc tgtgtgaaat tgttatccgc 6660 tcacaattcc acacaacata cgagccggaa gcataaagtg taaagcctgg ggtgcctaat 6720 gagtgagcta actcacatta attgcgttgc gctcactgcc cgctttccag tcgggaaacc 6780 tgtcgtgcca gctgcattaa tgaatcggcc aacgcgcggg gagaggcggt ttgcgtattg 6840 ggcgctcttc cgcttcctcg ctcactgact cgctgcgctc ggtcgttcgg ctgcggcgag 6900 cggtatcagc tcactcaaag gcggtaatac ggttatccac agaatcaggg gataacgcag 6960 gaaagaacat gtgagcaaaa ggccagcaaa aggccaggaa ccgtaaaaag gccgcgttgc 7020 tggcgttttt ccataggctc cgcccccctg acgagcatca caaaaatcga cgctcaagtc 7080 agaggtggcg aaacccgaca ggactataaa gataccaggc gtttccccct ggaagctccc 7140 tcgtgcgctc tcctgttccg accctgccgc ttaccggata cctgtccgcc tttctccctt 7200 cgggaagcgt ggcgctttct catagctcac gctgtaggta tctcagttcg gtgtaggtcg 7260 ttcgctccaa gctgggctgt gtgcacgaac cccccgttca gcccgaccgc tgcgccttat 7320 ccggtaacta tcgtcttgag tccaacccgg taagacacga cttatcgcca ctggcagcag 7380 ccactggtaa caggattagc agagcgaggt atgtaggcgg tgctacagag ttcttgaagt 7440 ggtggcctaa ctacggctac actagaagga cagtatttgg tatctgcgct ctgctgaagc 7500 cagttacctt cggaaaaaga gttggtagct cttgatccgg caaacaaacc accgctggta 7560 gcggtggttt ttttgtttgc 
aagcagcaga ttacgcgcag aaaaaaagga tctcaagaag 7620 atcctttgat cttttctacg gggtctgacg ctcagtggaa cgaaaactca cgttaaggga 7680 ttttggtcat gagattatca aaaaggatct tcacctagat ccttttaaat taaaaatgaa 7740 gttttaaatc aatctaaagt atatatgagt aaacttggtc tgacagttac caatgcttaa 7800 tcagtgaggc acctatctca gcgatctgtc tatttcgttc atccatagtt gcctgactcc 7860 ccgtcgtgta gataactacg atacgggagg gcttaccatc tggccccagt gctgcaatga 7920 taccgcgaga cccacgctca ccggctccag atttatcagc aataaaccag ccagccggaa 7980 gggccgagcg cagaagtggt cctgcaactt tatccgcctc catccagtct attaattgtt 8040 gccgggaagc tagagtaagt agttcgccag ttaatagttt gcgcaacgtt gttgccattg 8100 ctacaggcat cgtggtgtca cgctcgtcgt ttggtatggc ttcattcagc tccggttccc 8160 aacgatcaag gcgagttaca tgatccccca tgttgtgcaa aaaagcggtt agctccttcg 8220 gtcctccgat cgttgtcaga agtaagttgg ccgcagtgtt atcactcatg gttatggcag 8280 cactgcataa ttctcttact gtcatgccat ccgtaagatg cttttctgtg actggtgagt 8340 actcaaccaa gtcattctga gaatagtgta tgcggcgacc gagttgctct tgcccggcgt 8400 caatacggga taataccgcg ccacatagca gaactttaaa agtgctcatc attggaaaac 8460 gttcttcggg gcgaaaactc tcaaggatct taccgctgtt gagatccagt tcgatgtaac 8520 ccactcgtgc acccaactga tcttcagcat cttttacttt caccagcgtt tctgggtgag 8580 caaaaacagg aaggcaaaat gccgcaaaaa agggaataag ggcgacacgg aaatgttgaa 8640 tactcatact cttccttttt caatattatt gaagcattta tcagggttat tgtctcatga 8700 gcggatacat atttgaatgt atttagaaaa ataaacaaat aggggttccg cgcacatttc 8760 cccgaaaagt gccacctaaa ttgtaagcgt taatattttg ttaaaattcg cgttaaattt 8820 ttgttaaatc agctcatttt ttaaccaata ggccgaaatc ggcaaaatcc cttataaatc 8880 aaaagaatag accgagatag ggttgagtgt tgttccagtt tggaacaaga gtccactatt 8940 aaagaacgtg gactccaacg tcaaagggcg aaaaaccgtc tatcagggcg atggcccact 9000 acgtgaacca tcaccctaat caagtttttt ggggtcgagg tgccgtaaag cactaaatcg 9060 gaaccctaaa gggagccccc gatttagagc ttgacgggga aagccaacct ggcttatcga 9120 aattaatacg actcactata gggagaccgg c 9151 11 8528 DNA Artificial Sequence Description of Artificial Sequence pONY8G 5′cPPT POS delCTS EIAV vector genome plasmid 11 agatcttgaa 
taataaaatg tgtgtttgtc cgaaatacgc gttttgagat ttctgtcgcc 60 gactaaattc atgtcgcgcg atagtggtgt ttatcgccga tagagatggc gatattggaa 120 aaattgatat ttgaaaatat ggcatattga aaatgtcgcc gatgtgagtt tctgtgtaac 180 tgatatcgcc atttttccaa aagtgatttt tgggcatacg cgatatctgg cgatagcgct 240 tatatcgttt acgggggatg gcgatagacg actttggtga cttgggcgat tctgtgtgtc 300 gcaaatatcg cagtttcgat ataggtgaca gacgatatga ggctatatcg ccgatagagg 360 cgacatcaag ctggcacatg gccaatgcat atcgatctat acattgaatc aatattggcc 420 attagccata ttattcattg gttatatagc ataaatcaat attggctatt ggccattgca 480 tacgttgtat ccatatcgta atatgtacat ttatattggc tcatgtccaa cattaccgcc 540 atgttgacat tgattattga ctagttatta atagtaatca attacggggt cattagttca 600 tagcccatat atggagttcc gcgttacata acttacggta aatggcccgc ctggctgacc 660 gcccaacgac ccccgcccat tgacgtcaat aatgacgtat gttcccatag taacgccaat 720 agggactttc cattgacgtc aatgggtgga gtatttacgg taaactgccc acttggcagt 780 acatcaagtg tatcatatgc caagtccgcc ccctattgac gtcaatgacg gtaaatggcc 840 cgcctggcat tatgcccagt acatgacctt acgggacttt cctacttggc agtacatcta 900 cgtattagtc atcgctatta ccatggtgat gcggttttgg cagtacacca atgggcgtgg 960 atagcggttt gactcacggg gatttccaag tctccacccc attgacgtca atgggagttt 1020 gttttggcac caaaatcaac gggactttcc aaaatgtcgt aacaactgcg atcgcccgcc 1080 ccgttgacgc aaatgggcgg taggcgtgta cggtgggagg tctatataag cagagctcgt 1140 ttagtgaacc gggcactcag attctgcggt ctgagtccct tctctgctgg gctgaaaagg 1200 cctttgtaat aaatataatt ctctactcag tccctgtctc tagtttgtct gttcgagatc 1260 ctacagttgg cgcccgaaca gggacctgag aggggcgcag accctacctg ttgaacctgg 1320 ctgatcgtag gatccccggg acagcagagg agaacttaca gaagtcttct ggaggtgttc 1380 ctggccagaa cacaggagga caggtaagat tgggagaccc tttgacattg gagcaaggcg 1440 ctcaagaagt tagagaaggt gacggtacaa gggtctcaga aattaactac tggtaactgt 1500 aattgggcgc taagtctagt agacttattt catgatacca actttgtaaa agaaaaggac 1560 tggcagctga gggatgtcat tccattgctg gaagatgtaa ctcagacgct gtcaggacaa 1620 gaaagagagg cctttgaaag aacatggtgg gcaatttctg ctgtaaagat gggcctccag 1680 attaataatg tagtagatgg aaaggcatca 
ttccagctcc taagagcgaa atatgaaaag 1740 aagactgcta ataaaaagca gtctgagccc tctgaagaat atctctagag tcgacgctct 1800 cattacttgt aacaaaggga gggaaagtat gggaggacag acaccatggg aagtatttat 1860 cactaatcaa gcacaagtaa tacatgagaa acttttacta cagcaagcac aatcctccaa 1920 aaaattttgt ttttacaaaa tccctggtga acatggtcga ctctagaact agtggatccc 1980 ccgggctgca ggagtgggga ggcacgatgg ccgctttggt cgaggcggat ccggccatta 2040 gccatattat tcattggtta tatagcataa atcaatattg gctattggcc attgcatacg 2100 ttgtatccat atcataatat gtacatttat attggctcat gtccaacatt accgccatgt 2160 tgacattgat tattgactag ttattaatag taatcaatta cggggtcatt agttcatagc 2220 ccatatatgg agttccgcgt tacataactt acggtaaatg gcccgcctgg ctgaccgccc 2280 aacgaccccc gcccattgac gtcaataatg acgtatgttc ccatagtaac gccaataggg 2340 actttccatt gacgtcaatg ggtggagtat ttacggtaaa ctgcccactt ggcagtacat 2400 caagtgtatc atatgccaag tacgccccct attgacgtca atgacggtaa atggcccgcc 2460 tggcattatg cccagtacat gaccttatgg gactttccta cttggcagta catctacgta 2520 ttagtcatcg ctattaccat ggtgatgcgg ttttggcagt acatcaatgg gcgtggatag 2580 cggtttgact cacggggatt tccaagtctc caccccattg acgtcaatgg gagtttgttt 2640 tggcaccaaa atcaacggga ctttccaaaa tgtcgtaaca actccgcccc attgacgcaa 2700 atgggcggta ggcatgtacg gtgggaggtc tatataagca gagctcgttt agtgaaccgt 2760 cagatcgcct ggagacgcca tccacgctgt tttgacctcc atagaagaca ccgggaccga 2820 tccagcctcc gcggccccaa gcttgttggg atccaccggt cgccaccatg gtgagcaagg 2880 gcgaggagct gttcaccggg gtggtgccca tcctggtcga gctggacggc gacgtaaacg 2940 gccacaagtt cagcgtgtcc ggcgagggcg agggcgatgc cacctacggc aagctgaccc 3000 tgaagttcat ctgcaccacc ggcaagctgc ccgtgccctg gcccaccctc gtgaccaccc 3060 tgacctacgg cgtgcagtgc ttcagccgct accccgacca catgaagcag cacgacttct 3120 tcaagtccgc catgcccgaa ggctacgtcc aggagcgcac catcttcttc aaggacgacg 3180 gcaactacaa gacccgcgcc gaggtgaagt tcgagggcga caccctggtg aaccgcatcg 3240 agctgaaggg catcgacttc aaggaggacg gcaacatcct ggggcacaag ctggagtaca 3300 actacaacag ccacaacgtc tatatcatgg ccgacaagca gaagaacggc atcaaggtga 3360 acttcaagat ccgccacaac atcgaggacg gcagcgtgca 
gctcgccgac cactaccagc 3420 agaacacccc catcggcgac ggccccgtgc tgctgcccga caaccactac ctgagcaccc 3480 agtccgccct gagcaaagac cccaacgaga agcgcgatca catggtcctg ctggagttcg 3540 tgaccgccgc cgggatcact ctcggcatgg acgagctgta caagtaaagc ggccgcgact 3600 ctagagtcga cctcgagggg gggcccggac ctactagggt gctgtggaag ggtgatggtg 3660 cagtagtagt taatgatgaa ggaaagggaa taattgctgt accattaacc aggactaagt 3720 tactaataaa accaaattga gtattgttgc aggaagcaag acccaactac cattgtcagc 3780 tgtgtttcct gacctcaata tttgttataa ggtttgatat gaatcccagg gggaatctca 3840 acccctatta cccaacagtc agaaaaatct aagtgtgagg agaacacaat gtttcaacct 3900 tattgttata ataatgacag taagaacagc atggcagaat cgaaggaagc aagagaccaa 3960 gaatgaacct gaaagaagaa tctaaagaag aaaaaagaag aaatgactgg tggaaaatag 4020 gtatgtttct gttatgctta gcaggaacta ctggaggaat actttggtgg tatgaaggac 4080 tcccacagca acattatata gggttggtgg cgataggggg aagattaaac ggatctggcc 4140 aatcaaatgc tatagaatgc tggggttcct tcccggggtg tagaccattt caaaattact 4200 tcagttatga gaccaataga agcatgcata tggataataa tactgctaca ttattagaag 4260 ctttaaccaa tataactgct ctataaataa caaaacagaa ttagaaacat ggaagttagt 4320 aaagacttct ggcataactc ctttacctat ttcttctgaa gctaacactg gactaattag 4380 acataagaga gattttggta taagtgcaat agtggcagct attgtagccg ctactgctat 4440 tgctgctagc gctactatgt cttatgttgc tctaactgag gttaacaaaa taatggaagt 4500 acaaaatcat acttttgagg tagaaaatag tactctaaat ggtatggatt taatagaacg 4560 acaaataaag atattatatg ctatgattct tcaaacacat gcagatgttc aactgttaaa 4620 ggaaagacaa caggtagagg agacatttaa tttaattgga tgtatagaaa gaacacatgt 4680 attttgtcat actggtcatc cctggaatat gtcatgggga catttaaatg agtcaacaca 4740 atgggatgac tgggtaagca aaatggaaga tttaaatcaa gagatactaa ctacacttca 4800 tggagccagg aacaatttgg cacaatccat gataacattc aatacaccag atagtatagc 4860 tcaatttgga aaagaccttt ggagtcatat tggaaattgg attcctggat tgggagcttc 4920 cattataaaa tatatagtga tgtttttgct tatttatttg ttactaacct cttcgcctaa 4980 gatcctcagg gccctctgga aggtgaccag tggtgcaggg tcctccggca gtcgttacct 5040 gaagaaaaaa ttccatcaca aacatgcatc gcgagaagac acctgggacc 
aggcccaaca 5100 caacatacac ctagcaggcg tgaccggtgg atcaggggac aaatactaca agcagaagta 5160 ctccaggaac gactggaatg gagaatcaga ggagtacaac aggcggccaa agagctgggt 5220 gaagtcaatc gaggcatttg gagagagcta tatttccgag aagaccaaag gggagatttc 5280 tcagcctggg gcggctatca acgagcacaa gaacggctct ggggggaaca atcctcacca 5340 agggtcctta gacctggaga ttcgaagcga aggaggaaac atttatgact gttgcattaa 5400 agcccaagaa ggaactctcg ctatcccttg ctgtggattt cccttatggc tattttgggg 5460 actagtaatt atagtaggac gcatagcagg ctatggatta cgtggactcg ctgttataat 5520 aaggatttgt attagaggct taaatttgat atttgaaata atcagaaaaa tgcttgatta 5580 tattggaaga gctttaaatc ctggcacatc tcatgtatca atgcctcagt atgtttagaa 5640 aaacaagggg ggaactgtgg ggtttttatg aggggtttta taaatgatta taagagtaaa 5700 aagaaagttg ctgatgctct cataaccttg tataacccaa aggactagct catgttgcta 5760 ggcaactaaa ccgcaataac cgcatttgtg acgcgagttc cccattggtg acgcgttaac 5820 ttcctgtttt tacagtatat aagtgcttgt attctgacaa ttgggcactc agattctgcg 5880 gtctgagtcc cttctctgct gggctgaaaa ggcctttgta ataaatataa ttctctactc 5940 agtccctgtc tctagtttgt ctgttcgaga tcctacagag ctcatgcctt ggcgtaatca 6000 tggtcatagc tgtttcctgt gtgaaattgt tatccgctca caattccaca caacatacga 6060 gccggaagca taaagtgtaa agcctggggt gcctaatgag tgagctaact cacattaatt 6120 gcgttgcgct cactgcccgc tttccagtcg ggaaacctgt cgtgccagct gcattaatga 6180 atcggccaac gcgcggggag aggcggtttg cgtattgggc gctcttccgc ttcctcgctc 6240 actgactcgc tgcgctcggt cgttcggctg cggcgagcgg tatcagctca ctcaaaggcg 6300 gtaatacggt tatccacaga atcaggggat aacgcaggaa agaacatgtg agcaaaaggc 6360 cagcaaaagg ccaggaaccg taaaaaggcc gcgttgctgg cgtttttcca taggctccgc 6420 ccccctgacg agcatcacaa aaatcgacgc tcaagtcaga ggtggcgaaa cccgacagga 6480 ctataaagat accaggcgtt tccccctgga agctccctcg tgcgctctcc tgttccgacc 6540 ctgccgctta ccggatacct gtccgccttt ctcccttcgg gaagcgtggc gctttctcat 6600 agctcacgct gtaggtatct cagttcggtg taggtcgttc gctccaagct gggctgtgtg 6660 cacgaacccc ccgttcagcc cgaccgctgc gccttatccg gtaactatcg tcttgagtcc 6720 aacccggtaa gacacgactt atcgccactg gcagcagcca ctggtaacag gattagcaga 
6780 gcgaggtatg taggcggtgc tacagagttc ttgaagtggt ggcctaacta cggctacact 6840 agaaggacag tatttggtat ctgcgctctg ctgaagccag ttaccttcgg aaaaagagtt 6900 ggtagctctt gatccggcaa acaaaccacc gctggtagcg gtggtttttt tgtttgcaag 6960 cagcagatta cgcgcagaaa aaaaggatct caagaagatc ctttgatctt ttctacgggg 7020 tctgacgctc agtggaacga aaactcacgt taagggattt tggtcatgag attatcaaaa 7080 aggatcttca cctagatcct tttaaattaa aaatgaagtt ttaaatcaat ctaaagtata 7140 tatgagtaaa cttggtctga cagttaccaa tgcttaatca gtgaggcacc tatctcagcg 7200 atctgtctat ttcgttcatc catagttgcc tgactccccg tcgtgtagat aactacgata 7260 cgggagggct taccatctgg ccccagtgct gcaatgatac cgcgagaccc acgctcaccg 7320 gctccagatt tatcagcaat aaaccagcca gccggaaggg ccgagcgcag aagtggtcct 7380 gcaactttat ccgcctccat ccagtctatt aattgttgcc gggaagctag agtaagtagt 7440 tcgccagtta atagtttgcg caacgttgtt gccattgcta caggcatcgt ggtgtcacgc 7500 tcgtcgtttg gtatggcttc attcagctcc ggttcccaac gatcaaggcg agttacatga 7560 tcccccatgt tgtgcaaaaa agcggttagc tccttcggtc ctccgatcgt tgtcagaagt 7620 aagttggccg cagtgttatc actcatggtt atggcagcac tgcataattc tcttactgtc 7680 atgccatccg taagatgctt ttctgtgact ggtgagtact caaccaagtc attctgagaa 7740 tagtgtatgc ggcgaccgag ttgctcttgc ccggcgtcaa tacgggataa taccgcgcca 7800 catagcagaa ctttaaaagt gctcatcatt ggaaaacgtt cttcggggcg aaaactctca 7860 aggatcttac cgctgttgag atccagttcg atgtaaccca ctcgtgcacc caactgatct 7920 tcagcatctt ttactttcac cagcgtttct gggtgagcaa aaacaggaag gcaaaatgcc 7980 gcaaaaaagg gaataagggc gacacggaaa tgttgaatac tcatactctt cctttttcaa 8040 tattattgaa gcatttatca gggttattgt ctcatgagcg gatacatatt tgaatgtatt 8100 tagaaaaata aacaaatagg ggttccgcgc acatttcccc gaaaagtgcc acctaaattg 8160 taagcgttaa tattttgtta aaattcgcgt taaatttttg ttaaatcagc tcatttttta 8220 accaataggc cgaaatcggc aaaatccctt ataaatcaaa agaatagacc gagatagggt 8280 tgagtgttgt tccagtttgg aacaagagtc cactattaaa gaacgtggac tccaacgtca 8340 aagggcgaaa aaccgtctat cagggcgatg gcccactacg tgaaccatca ccctaatcaa 8400 gttttttggg gtcgaggtgc cgtaaagcac taaatcggaa ccctaaaggg agcccccgat 8460 
ttagagcttg acggggaaag ccaacctggc ttatcgaaat taatacgact cactataggg 8520 agaccggc 8528 12 10112 DNA Artificial Sequence Description of Artificial Sequence pESYNGP, codon-optimised EIAV gag/pol expression plasmid 12 tcaatattgg ccattagcca tattattcat tggttatata gcataaatca atattggcta 60 ttggccattg catacgttgt atctatatca taatatgtac atttatattg gctcatgtcc 120 aatatgaccg ccatgttggc attgattatt gactagttat taatagtaat caattacggg 180 gtcattagtt catagcccat atatggagtt ccgcgttaca taacttacgg taaatggccc 240 gcctggctga ccgcccaacg acccccgccc attgacgtca ataatgacgt atgttcccat 300 agtaacgcca atagggactt tccattgacg tcaatgggtg gagtatttac ggtaaactgc 360 ccacttggca gtacatcaag tgtatcatat gccaagtccg ccccctattg acgtcaatga 420 cggtaaatgg cccgcctggc attatgccca gtacatgacc ttacgggact ttcctacttg 480 gcagtacatc tacgtattag tcatcgctat taccatggtg atgcggtttt ggcagtacac 540 caatgggcgt ggatagcggt ttgactcacg gggatttcca agtctccacc ccattgacgt 600 caatgggagt ttgttttggc accaaaatca acgggacttt ccaaaatgtc gtaacaactg 660 cgatcgcccg ccccgttgac gcaaatgggc ggtaggcgtg tacggtggga ggtctatata 720 agcagagctc gtttagtgaa ccgtcagatc actagaagct ttattgcggt agtttatcac 780 agttaaattg ctaacgcagt cagtgcttct gacacaacag tctcgaactt aagctgcagt 840 gactctctta aggtagcctt gcagaagttg gtcgtgaggc actgggcagg taagtatcaa 900 ggttacaaga caggtttaag gagaccaata gaaactgggc ttgtcgagac agagaagact 960 cttgcgtttc tgataggcac ctattggtct tactgacatc cactttgcct ttctctccac 1020 aggtgtccac tcccagttca attacagctc ttaaggctag agtacttaat acgactcact 1080 ataggctaga gaattcgcca ccatgggcga tcccctcacc tggtccaaag ccctgaagaa 1140 actggaaaaa gtcaccgttc agggtagcca aaagcttacc acaggcaatt gcaactgggc 1200 attgtccctg gtggatcttt tccacgacac taatttcgtt aaggagaaag attggcaact 1260 cagagacgtg atccccctct tggaggacgt gacccaaaca ttgtctgggc aggagcgcga 1320 agctttcgag cgcacctggt gggccatcag cgcagtcaaa atggggctgc aaatcaacaa 1380 cgtggttgac ggtaaagcta gctttcaact gctccgcgct aagtacgaga agaaaaccgc 1440 caacaagaaa caatccgaac ctagcgagga gtacccaatt atgatcgacg gcgccggcaa 1500 taggaacttc cgcccactga 
ctcccagggg ctataccacc tgggtcaaca ccatccagac 1560 aaacggactt ttgaacgaag cctcccagaa cctgttcggc atcctgtctg tggactgcac 1620 ctccgaagaa atgaatgctt ttctcgacgt ggtgccagga caggctggac agaaacagat 1680 cctgctcgat gccattgaca agatcgccga cgactgggat aatcgccacc ccctgccaaa 1740 cgcccctctg gtggctcccc cacaggggcc tatccctatg accgctaggt tcattagggg 1800 actgggggtg ccccgcgaac gccagatgga gccagcattt gaccaattta ggcagaccta 1860 cagacagtgg atcatcgaag ccatgagcga ggggattaaa gtcatgatcg gaaagcccaa 1920 ggcacagaac atcaggcagg gggccaagga accataccct gagtttgtcg acaggcttct 1980 gtcccagatt aaatccgaag gccaccctca ggagatctcc aagttcttga cagacacact 2040 gactatccaa aatgcaaatg aagagtgcag aaacgccatg aggcacctca gacctgaaga 2100 taccctggag gagaaaatgt acgcatgtcg cgacattggc actaccaagc aaaagatgat 2160 gctgctcgcc aaggctctgc aaaccggcct ggctggtcca ttcaaaggag gagcactgaa 2220 gggaggtcca ttgaaagctg cacaaacatg ttataattgt gggaagccag gacatttatc 2280 tagtcaatgt agagcaccta aagtctgttt taaatgtaaa cagcctggac atttctcaaa 2340 gcaatgcaga agtgttccaa aaaacgggaa gcaaggggct caagggaggc cccagaaaca 2400 aactttcccg atacaacaga agagtcagca caacaaatct gttgtacaag agactcctca 2460 gactcaaaat ctgtacccag atctgagcga aataaaaaag gaatacaatg tcaaggagaa 2520 ggatcaagta gaggatctca acctggacag tttgtgggag taacatacaa tctcgagaag 2580 aggcccacta ccatcgtcct gatcaatgac acccctctta atgtgctgct ggacaccgga 2640 gccgacacca gcgttctcac tactgctcac tataacagac tgaaatacag aggaaggaaa 2700 taccagggca caggcatcat cggcgttgga ggcaacgtcg aaaccttttc cactcctgtc 2760 accatcaaaa agaaggggag acacattaaa accagaatgc tggtcgccga catccccgtc 2820 accatccttg gcagagacat tctccaggac ctgggcgcta aactcgtgct ggcacaactg 2880 tctaaggaaa tcaagttccg caagatcgag ctgaaagagg gcacaatggg tccaaaaatc 2940 ccccagtggc ccctgaccaa agagaagctt gagggcgcta aggaaatcgt gcagcgcctg 3000 ctttctgagg gcaagattag cgaggccagc gacaataacc cttacaacag ccccatcttt 3060 gtgattaaga aaaggagcgg caaatggaga ctcctgcagg acctgaggga actcaacaag 3120 accgtccagg tcggaactga gatctctcgc ggactgcctc accccggcgg cctgattaaa 3180 tgcaagcaca tgacagtcct tgacattgga 
gacgcttatt ttaccatccc cctcgatcct 3240 gaatttcgcc cctatactgc ttttaccatc cccagcatca atcaccagga gcccgataaa 3300 cgctatgtgt ggaagtgcct cccccaggga tttgtgctta gcccctacat ttaccagaag 3360 acacttcaag agatcctcca acctttccgc gaaagatacc cagaggttca actctaccaa 3420 tatatggacg acctgttcat ggggtccaac gggtctaaga agcagcacaa ggaactcatc 3480 atcgaactga gggcaatcct cctggagaaa ggcttcgaga cacccgacga caagctgcaa 3540 gaagttcctc catatagctg gctgggctac cagctttgcc ctgaaaactg gaaagtccag 3600 aagatgcagt tggatatggt caagaaccca acactgaacg acgtccagaa gctcatgggc 3660 aatattacct ggatgagctc cggaatccct gggcttaccg ttaagcacat tgccgcaact 3720 acaaaaggat gcctggagtt gaaccagaag gtcatttgga cagaggaagc tcagaaggaa 3780 ctggaggaga ataatgaaaa gattaagaat gctcaagggc tccaatacta caatcccgaa 3840 gaagaaatgt tgtgcgaggt cgaaatcact aagaactacg aagccaccta tgtcatcaaa 3900 cagtcccaag gcatcttgtg ggccggaaag aaaatcatga aggccaacaa aggctggtcc 3960 accgttaaaa atctgatgct cctgctccag cacgtcgcca ccgagtctat cacccgcgtc 4020 ggcaagtgcc ccaccttcaa agttcccttc actaaggagc aggtgatgtg ggagatgcaa 4080 aaaggctggt actactcttg gcttcccgag atcgtctaca cccaccaagt ggtgcacgac 4140 gactggagaa tgaagcttgt cgaggagccc actagcggaa ttacaatcta taccgacggc 4200 ggaaagcaaa acggagaggg aatcgctgca tacgtcacat ctaacggccg caccaagcaa 4260 aagaggctcg gccctgtcac tcaccaggtg gctgagagga tggctatcca gatggccctt 4320 gaggacacta gagacaagca ggtgaacatt gtgactgaca gctactactg ctggaaaaac 4380 atcacagagg gccttggcct ggagggaccc cagtctccct ggtggcctat catccagaat 4440 atccgcgaaa aggaaattgt ctatttcgcc tgggtgcctg gacacaaagg aatttacggc 4500 aaccaactcg ccgatgaagc cgccaaaatt aaagaggaaa tcatgcttgc ctaccagggc 4560 acacagatta aggagaagag agacgaggac gctggctttg acctgtgtgt gccatacgac 4620 atcatgattc ccgttagcga cacaaagatc attccaaccg atgtcaagat ccaggtgcca 4680 cccaattcat ttggttgggt gaccggaaag tccagcatgg ctaagcaggg tcttctgatt 4740 aacgggggaa tcattgatga aggatacacc ggcgaaatcc aggtgatctg cacaaatatc 4800 ggcaaaagca atattaagct tatcgaaggg cagaagttcg ctcaactcat catcctccag 4860 caccacagca attcaagaca accttgggac gaaaacaaga 
ttagccagag aggtgacaag 4920 ggcttcggca gcacaggtgt gttctgggtg gagaacatcc aggaagcaca ggacgagcac 4980 gagaattggc acacctcccc taagattttg gcccgcaatt acaagatccc actgactgtg 5040 gctaagcaga tcacacagga atgcccccac tgcaccaaac aaggttctgg ccccgccggc 5100 tgcgtgatga ggtcccccaa tcactggcag gcagattgca cccacctcga caacaaaatt 5160 atcctgacct tcgtggagag caattccggc tacatccacg caacactcct ctccaaggaa 5220 aatgcattgt gcacctccct cgcaattctg gaatgggcca ggctgttctc tccaaaatcc 5280 ctgcacaccg acaacggcac caactttgtg gctgaacctg tggtgaatct gctgaagttc 5340 ctgaaaatcg cccacaccac tggcattccc tatcaccctg aaagccaggg cattgtcgag 5400 agggccaaca gaactctgaa agaaaagatc caatctcaca gagacaatac acagacattg 5460 gaggccgcac ttcagctcgc ccttatcacc tgcaacaaag gaagagaaag catgggcggc 5520 cagaccccct gggaggtctt catcactaac caggcccagg tcatccatga aaagctgctc 5580 ttgcagcagg cccagtcctc caaaaagttc tgcttttata agatccccgg tgagcacgac 5640 tggaaaggtc ctacaagagt tttgtggaaa ggagacggcg cagttgtggt gaacgatgag 5700 ggcaagggga tcatcgctgt gcccctgaca cgcaccaagc ttctcatcaa gccaaactga 5760 acccggggcg gccgcttccc tttagtgagg gttaatgctt cgagcagaca tgataagata 5820 cattgatgag tttggacaaa ccacaactag aatgcagtga aaaaaatgct ttatttgtga 5880 aatttgtgat gctattgctt tatttgtaac cattataagc tgcaataaac aagttaacaa 5940 caacaattgc attcatttta tgtttcaggt tcagggggag atgtgggagg ttttttaaag 6000 caagtaaaac ctctacaaat gtggtaaaat ccgataagga tcgatccggg ctggcgtaat 6060 agcgaagagg cccgcaccga tcgcccttcc caacagttgc gcagcctgaa tggcgaatgg 6120 acgcgccctg tagcggcgca ttaagcgcgg cgggtgtggt ggttacgcgc agcgtgaccg 6180 ctacacttgc cagcgcccta gcgcccgctc ctttcgcttt cttcccttcc tttctcgcca 6240 cgttcgccgg ctttccccgt caagctctaa atcgggggct ccctttaggg ttccgattta 6300 gagctttacg gcacctcgac cgcaaaaaac ttgatttggg tgatggttca cgtagtgggc 6360 catcgccctg atagacggtt tttcgccctt tgacgttgga gtccacgttc tttaatagtg 6420 gactcttgtt ccaaactgga acaacactca accctatctc ggtctattct tttgatttat 6480 aagggatttt gccgatttcg gcctattggt taaaaaatga gctgatttaa caaatattta 6540 acgcgaattt taacaaaata ttaacgttta caatttcgcc tgatgcggta 
ttttctcctt 6600 acgcatctgt gcggtatttc acaccgcata cgcggatctg cgcagcacca tggcctgaaa 6660 taacctctga aagaggaact tggttaggta ccttctgagg cggaaagaac cagctgtgga 6720 atgtgtgtca gttagggtgt ggaaagtccc caggctcccc agcaggcaga agtatgcaaa 6780 gcatgcatct caattagtca gcaaccaggt gtggaaagtc cccaggctcc ccagcaggca 6840 gaagtatgca aagcatgcat ctcaattagt cagcaaccat agtcccgccc ctaactccgc 6900 ccatcccgcc cctaactccg cccagttccg cccattctcc gccccatggc tgactaattt 6960 tttttattta tgcagaggcc gaggccgcct cggcctctga gctattccag aagtagtgag 7020 gaggcttttt tggaggccta ggcttttgca aaaagcttga ttcttctgac acaacagtct 7080 cgaacttaag gctagagcca ccatgattga acaagatgga ttgcacgcag gttctccggc 7140 cgcttgggtg gagaggctat tcggctatga ctgggcacaa cagacaatcg gctgctctga 7200 tgccgccgtg ttccggctgt cagcgcaggg gcgcccggtt ctttttgtca agaccgacct 7260 gtccggtgcc ctgaatgaac tgcaggacga ggcagcgcgg ctatcgtggc tggccacgac 7320 gggcgttcct tgcgcagctg tgctcgacgt tgtcactgaa gcgggaaggg actggctgct 7380 attgggcgaa gtgccggggc aggatctcct gtcatctcac cttgctcctg ccgagaaagt 7440 atccatcatg gctgatgcaa tgcggcggct gcatacgctt gatccggcta cctgcccatt 7500 cgaccaccaa gcgaaacatc gcatcgagcg agcacgtact cggatggaag ccggtcttgt 7560 cgatcaggat gatctggacg aagagcatca ggggctcgcg ccagccgaac tgttcgccag 7620 gctcaaggcg cgcatgcccg acggcgagga tctcgtcgtg acccatggcg atgcctgctt 7680 gccgaatatc atggtggaaa atggccgctt ttctggattc atcgactgtg gccggctggg 7740 tgtggcggac cgctatcagg acatagcgtt ggctacccgt gatattgctg aagagcttgg 7800 cggcgaatgg gctgaccgct tcctcgtgct ttacggtatc gccgctcccg attcgcagcg 7860 catcgccttc tatcgccttc ttgacgagtt cttctgagcg ggactctggg gttcgaaatg 7920 accgaccaag cgacgcccaa cctgccatca cgatggccgc aataaaatat ctttattttc 7980 attacatctg tgtgttggtt ttttgtgtga atcgatagcg ataaggatcc gcgtatggtg 8040 cactctcagt acaatctgct ctgatgccgc atagttaagc cagccccgac acccgccaac 8100 acccgctgac gcgccctgac gggcttgtct gctcccggca tccgcttaca gacaagctgt 8160 gaccgtctcc gggagctgca tgtgtcagag gttttcaccg tcatcaccga aacgcgcgag 8220 acgaaagggc ctcgtgatac gcctattttt ataggttaat gtcatgataa taatggtttc 
8280 ttagacgtca ggtggcactt ttcggggaaa tgtgcgcgga acccctattt gtttattttt 8340 ctaaatacat tcaaatatgt atccgctcat gagacaataa ccctgataaa tgcttcaata 8400 atattgaaaa aggaagagta tgagtattca acatttccgt gtcgccctta ttcccttttt 8460 tgcggcattt tgccttcctg tttttgctca cccagaaacg ctggtgaaag taaaagatgc 8520 tgaagatcag ttgggtgcac gagtgggtta catcgaactg gatctcaaca gcggtaagat 8580 ccttgagagt tttcgccccg aagaacgttt tccaatgatg agcactttta aagttctgct 8640 atgtggcgcg gtattatccc gtattgacgc cgggcaagag caactcggtc gccgcataca 8700 ctattctcag aatgacttgg ttgagtactc accagtcaca gaaaagcatc ttacggatgg 8760 catgacagta agagaattat gcagtgctgc cataaccatg agtgataaca ctgcggccaa 8820 cttacttctg acaacgatcg gaggaccgaa ggagctaacc gcttttttgc acaacatggg 8880 ggatcatgta actcgccttg atcgttggga accggagctg aatgaagcca taccaaacga 8940 cgagcgtgac accacgatgc ctgtagcaat ggcaacaacg ttgcgcaaac tattaactgg 9000 cgaactactt actctagctt cccggcaaca attaatagac tggatggagg cggataaagt 9060 tgcaggacca cttctgcgct cggcccttcc ggctggctgg tttattgctg ataaatctgg 9120 agccggtgag cgtgggtctc gcggtatcat tgcagcactg gggccagatg gtaagccctc 9180 ccgtatcgta gttatctaca cgacggggag tcaggcaact atggatgaac gaaatagaca 9240 gatcgctgag ataggtgcct cactgattaa gcattggtaa ctgtcagacc aagtttactc 9300 atatatactt tagattgatt taaaacttca tttttaattt aaaaggatct aggtgaagat 9360 cctttttgat aatctcatga ccaaaatccc ttaacgtgag ttttcgttcc actgagcgtc 9420 agaccccgta gaaaagatca aaggatcttc ttgagatcct ttttttctgc gcgtaatctg 9480 ctgcttgcaa acaaaaaaac caccgctacc agcggtggtt tgtttgccgg atcaagagct 9540 accaactctt tttccgaagg taactggctt cagcagagcg cagataccaa atactgtcct 9600 tctagtgtag ccgtagttag gccaccactt caagaactct gtagcaccgc ctacatacct 9660 cgctctgcta atcctgttac cagtggctgc tgccagtggc gataagtcgt gtcttaccgg 9720 gttggactca agacgatagt taccggataa ggcgcagcgg tcgggctgaa cggggggttc 9780 gtgcacacag cccagcttgg agcgaacgac ctacaccgaa ctgagatacc tacagcgtga 9840 gctatgagaa agcgccacgc ttcccgaagg gagaaaggcg gacaggtatc cggtaagcgg 9900 cagggtcgga acaggagagc gcacgaggga gcttccaggg ggaaacgcct ggtatcttta 9960 
tagtcctgtc gggtttcgcc acctctgact tgagcgtcga tttttgtgat gctcgtcagg 10020 ggggcggagc ctatggaaaa acgccagcaa cgcggccttt ttacggttcc tggccttttg 10080 ctggcctttt gctcacatgg ctcgacagat ct 10112 13 10114 DNA Artificial Sequence Description of Artificial Sequence pESDSYNGP, codon-optimised EIAV gag/pol expression plasmid 13 tcaatattgg ccattagcca tattattcat tggttatata gcataaatca atattggcta 60 ttggccattg catacgttgt atctatatca taatatgtac atttatattg gctcatgtcc 120 aatatgaccg ccatgttggc attgattatt gactagttat taatagtaat caattacggg 180 gtcattagtt catagcccat atatggagtt ccgcgttaca taacttacgg taaatggccc 240 gcctggctga ccgcccaacg acccccgccc attgacgtca ataatgacgt atgttcccat 300 agtaacgcca atagggactt tccattgacg tcaatgggtg gagtatttac ggtaaactgc 360 ccacttggca gtacatcaag tgtatcatat gccaagtccg ccccctattg acgtcaatga 420 cggtaaatgg cccgcctggc attatgccca gtacatgacc ttacgggact ttcctacttg 480 gcagtacatc tacgtattag tcatcgctat taccatggtg atgcggtttt ggcagtacac 540 caatgggcgt ggatagcggt ttgactcacg gggatttcca agtctccacc ccattgacgt 600 caatgggagt ttgttttggc accaaaatca acgggacttt ccaaaatgtc gtaacaactg 660 cgatcgcccg ccccgttgac gcaaatgggc ggtaggcgtg tacggtggga ggtctatata 720 agcagagctc gtttagtgaa ccgtcagatc actagaagct ttattgcggt agtttatcac 780 agttaaattg ctaacgcagt cagtgcttct gacacaacag tctcgaactt aagctgcagt 840 gactctctta aggtagcctt gcagaagttg gtcgtgaggc actgggcagg taagtatcaa 900 ggttacaaga caggtttaag gagaccaata gaaactgggc ttgtcgagac agagaagact 960 cttgcgtttc tgataggcac ctattggtct tactgacatc cactttgcct ttctctccac 1020 aggtgtccac tcccagttca attacagctc ttaaggctag agtacttaat acgactcact 1080 ataggctaga gaattccagg taagatgggc gatcccctca cctggtccaa agccctgaag 1140 aaactggaaa aagtcaccgt tcagggtagc caaaagctta ccacaggcaa ttgcaactgg 1200 gcattgtccc tggtggatct tttccacgac actaatttcg ttaaggagaa agattggcaa 1260 ctcagagacg tgatccccct cttggaggac gtgacccaaa cattgtctgg gcaggagcgc 1320 gaagctttcg agcgcacctg gtgggccatc agcgcagtca aaatggggct gcaaatcaac 1380 aacgtggttg acggtaaagc tagctttcaa ctgctccgcg ctaagtacga gaagaaaacc 
1440 gccaacaaga aacaatccga acctagcgag gagtacccaa ttatgatcga cggcgccggc 1500 aataggaact tccgcccact gactcccagg ggctatacca cctgggtcaa caccatccag 1560 acaaacggac ttttgaacga agcctcccag aacctgttcg gcatcctgtc tgtggactgc 1620 acctccgaag aaatgaatgc ttttctcgac gtggtgccag gacaggctgg acagaaacag 1680 atcctgctcg atgccattga caagatcgcc gacgactggg ataatcgcca ccccctgcca 1740 aacgcccctc tggtggctcc cccacagggg cctatcccta tgaccgctag gttcattagg 1800 ggactggggg tgccccgcga acgccagatg gagccagcat ttgaccaatt taggcagacc 1860 tacagacagt ggatcatcga agccatgagc gaggggatta aagtcatgat cggaaagccc 1920 aaggcacaga acatcaggca gggggccaag gaaccatacc ctgagtttgt cgacaggctt 1980 ctgtcccaga ttaaatccga aggccaccct caggagatct ccaagttctt gacagacaca 2040 ctgactatcc aaaatgcaaa tgaagagtgc agaaacgcca tgaggcacct cagacctgaa 2100 gataccctgg aggagaaaat gtacgcatgt cgcgacattg gcactaccaa gcaaaagatg 2160 atgctgctcg ccaaggctct gcaaaccggc ctggctggtc cattcaaagg aggagcactg 2220 aagggaggtc cattgaaagc tgcacaaaca tgttataatt gtgggaagcc aggacattta 2280 tctagtcaat gtagagcacc taaagtctgt tttaaatgta aacagcctgg acatttctca 2340 aagcaatgca gaagtgttcc aaaaaacggg aagcaagggg ctcaagggag gccccagaaa 2400 caaactttcc cgatacaaca gaagagtcag cacaacaaat ctgttgtaca agagactcct 2460 cagactcaaa atctgtaccc agatctgagc gaaataaaaa aggaatacaa tgtcaaggag 2520 aaggatcaag tagaggatct caacctggac agtttgtggg agtaacatac aatctcgaga 2580 agaggcccac taccatcgtc ctgatcaatg acacccctct taatgtgctg ctggacaccg 2640 gagccgacac cagcgttctc actactgctc actataacag actgaaatac agaggaagga 2700 aataccaggg cacaggcatc atcggcgttg gaggcaacgt cgaaaccttt tccactcctg 2760 tcaccatcaa aaagaagggg agacacatta aaaccagaat gctggtcgcc gacatccccg 2820 tcaccatcct tggcagagac attctccagg acctgggcgc taaactcgtg ctggcacaac 2880 tgtctaagga aatcaagttc cgcaagatcg agctgaaaga gggcacaatg ggtccaaaaa 2940 tcccccagtg gcccctgacc aaagagaagc ttgagggcgc taaggaaatc gtgcagcgcc 3000 tgctttctga gggcaagatt agcgaggcca gcgacaataa cccttacaac agccccatct 3060 ttgtgattaa gaaaaggagc ggcaaatgga gactcctgca ggacctgagg gaactcaaca 3120 
agaccgtcca ggtcggaact gagatctctc gcggactgcc tcaccccggc ggcctgatta 3180 aatgcaagca catgacagtc cttgacattg gagacgctta ttttaccatc cccctcgatc 3240 ctgaatttcg cccctatact gcttttacca tccccagcat caatcaccag gagcccgata 3300 aacgctatgt gtggaagtgc ctcccccagg gatttgtgct tagcccctac atttaccaga 3360 agacacttca agagatcctc caacctttcc gcgaaagata cccagaggtt caactctacc 3420 aatatatgga cgacctgttc atggggtcca acgggtctaa gaagcagcac aaggaactca 3480 tcatcgaact gagggcaatc ctcctggaga aaggcttcga gacacccgac gacaagctgc 3540 aagaagttcc tccatatagc tggctgggct accagctttg ccctgaaaac tggaaagtcc 3600 agaagatgca gttggatatg gtcaagaacc caacactgaa cgacgtccag aagctcatgg 3660 gcaatattac ctggatgagc tccggaatcc ctgggcttac cgttaagcac attgccgcaa 3720 ctacaaaagg atgcctggag ttgaaccaga aggtcatttg gacagaggaa gctcagaagg 3780 aactggagga gaataatgaa aagattaaga atgctcaagg gctccaatac tacaatcccg 3840 aagaagaaat gttgtgcgag gtcgaaatca ctaagaacta cgaagccacc tatgtcatca 3900 aacagtccca aggcatcttg tgggccggaa agaaaatcat gaaggccaac aaaggctggt 3960 ccaccgttaa aaatctgatg ctcctgctcc agcacgtcgc caccgagtct atcacccgcg 4020 tcggcaagtg ccccaccttc aaagttccct tcactaagga gcaggtgatg tgggagatgc 4080 aaaaaggctg gtactactct tggcttcccg agatcgtcta cacccaccaa gtggtgcacg 4140 acgactggag aatgaagctt gtcgaggagc ccactagcgg aattacaatc tataccgacg 4200 gcggaaagca aaacggagag ggaatcgctg catacgtcac atctaacggc cgcaccaagc 4260 aaaagaggct cggccctgtc actcaccagg tggctgagag gatggctatc cagatggccc 4320 ttgaggacac tagagacaag caggtgaaca ttgtgactga cagctactac tgctggaaaa 4380 acatcacaga gggccttggc ctggagggac cccagtctcc ctggtggcct atcatccaga 4440 atatccgcga aaaggaaatt gtctatttcg cctgggtgcc tggacacaaa ggaatttacg 4500 gcaaccaact cgccgatgaa gccgccaaaa ttaaagagga aatcatgctt gcctaccagg 4560 gcacacagat taaggagaag agagacgagg acgctggctt tgacctgtgt gtgccatacg 4620 acatcatgat tcccgttagc gacacaaaga tcattccaac cgatgtcaag atccaggtgc 4680 cacccaattc atttggttgg gtgaccggaa agtccagcat ggctaagcag ggtcttctga 4740 ttaacggggg aatcattgat gaaggataca ccggcgaaat ccaggtgatc tgcacaaata 4800 tcggcaaaag 
caatattaag cttatcgaag ggcagaagtt cgctcaactc atcatcctcc 4860 agcaccacag caattcaaga caaccttggg acgaaaacaa gattagccag agaggtgaca 4920 agggcttcgg cagcacaggt gtgttctggg tggagaacat ccaggaagca caggacgagc 4980 acgagaattg gcacacctcc cctaagattt tggcccgcaa ttacaagatc ccactgactg 5040 tggctaagca gatcacacag gaatgccccc actgcaccaa acaaggttct ggccccgccg 5100 gctgcgtgat gaggtccccc aatcactggc aggcagattg cacccacctc gacaacaaaa 5160 ttatcctgac cttcgtggag agcaattccg gctacatcca cgcaacactc ctctccaagg 5220 aaaatgcatt gtgcacctcc ctcgcaattc tggaatgggc caggctgttc tctccaaaat 5280 ccctgcacac cgacaacggc accaactttg tggctgaacc tgtggtgaat ctgctgaagt 5340 tcctgaaaat cgcccacacc actggcattc cctatcaccc tgaaagccag ggcattgtcg 5400 agagggccaa cagaactctg aaagaaaaga tccaatctca cagagacaat acacagacat 5460 tggaggccgc acttcagctc gcccttatca cctgcaacaa aggaagagaa agcatgggcg 5520 gccagacccc ctgggaggtc ttcatcacta accaggccca ggtcatccat gaaaagctgc 5580 tcttgcagca ggcccagtcc tccaaaaagt tctgctttta taagatcccc ggtgagcacg 5640 actggaaagg tcctacaaga gttttgtgga aaggagacgg cgcagttgtg gtgaacgatg 5700 agggcaaggg gatcatcgct gtgcccctga cacgcaccaa gcttctcatc aagccaaact 5760 gaacccgggg cggccgcttc cctttagtga gggttaatgc ttcgagcaga catgataaga 5820 tacattgatg agtttggaca aaccacaact agaatgcagt gaaaaaaatg ctttatttgt 5880 gaaatttgtg atgctattgc tttatttgta accattataa gctgcaataa acaagttaac 5940 aacaacaatt gcattcattt tatgtttcag gttcaggggg agatgtggga ggttttttaa 6000 agcaagtaaa acctctacaa atgtggtaaa atccgataag gatcgatccg ggctggcgta 6060 atagcgaaga ggcccgcacc gatcgccctt cccaacagtt gcgcagcctg aatggcgaat 6120 ggacgcgccc tgtagcggcg cattaagcgc ggcgggtgtg gtggttacgc gcagcgtgac 6180 cgctacactt gccagcgccc tagcgcccgc tcctttcgct ttcttccctt cctttctcgc 6240 cacgttcgcc ggctttcccc gtcaagctct aaatcggggg ctccctttag ggttccgatt 6300 tagagcttta cggcacctcg accgcaaaaa acttgatttg ggtgatggtt cacgtagtgg 6360 gccatcgccc tgatagacgg tttttcgccc tttgacgttg gagtccacgt tctttaatag 6420 tggactcttg ttccaaactg gaacaacact caaccctatc tcggtctatt cttttgattt 6480 ataagggatt ttgccgattt 
cggcctattg gttaaaaaat gagctgattt aacaaatatt 6540 taacgcgaat tttaacaaaa tattaacgtt tacaatttcg cctgatgcgg tattttctcc 6600 ttacgcatct gtgcggtatt tcacaccgca tacgcggatc tgcgcagcac catggcctga 6660 aataacctct gaaagaggaa cttggttagg taccttctga ggcggaaaga accagctgtg 6720 gaatgtgtgt cagttagggt gtggaaagtc cccaggctcc ccagcaggca gaagtatgca 6780 aagcatgcat ctcaattagt cagcaaccag gtgtggaaag tccccaggct ccccagcagg 6840 cagaagtatg caaagcatgc atctcaatta gtcagcaacc atagtcccgc ccctaactcc 6900 gcccatcccg cccctaactc cgcccagttc cgcccattct ccgccccatg gctgactaat 6960 tttttttatt tatgcagagg ccgaggccgc ctcggcctct gagctattcc agaagtagtg 7020 aggaggcttt tttggaggcc taggcttttg caaaaagctt gattcttctg acacaacagt 7080 ctcgaactta aggctagagc caccatgatt gaacaagatg gattgcacgc aggttctccg 7140 gccgcttggg tggagaggct attcggctat gactgggcac aacagacaat cggctgctct 7200 gatgccgccg tgttccggct gtcagcgcag gggcgcccgg ttctttttgt caagaccgac 7260 ctgtccggtg ccctgaatga actgcaggac gaggcagcgc ggctatcgtg gctggccacg 7320 acgggcgttc cttgcgcagc tgtgctcgac gttgtcactg aagcgggaag ggactggctg 7380 ctattgggcg aagtgccggg gcaggatctc ctgtcatctc accttgctcc tgccgagaaa 7440 gtatccatca tggctgatgc aatgcggcgg ctgcatacgc ttgatccggc tacctgccca 7500 ttcgaccacc aagcgaaaca tcgcatcgag cgagcacgta ctcggatgga agccggtctt 7560 gtcgatcagg atgatctgga cgaagagcat caggggctcg cgccagccga actgttcgcc 7620 aggctcaagg cgcgcatgcc cgacggcgag gatctcgtcg tgacccatgg cgatgcctgc 7680 ttgccgaata tcatggtgga aaatggccgc ttttctggat tcatcgactg tggccggctg 7740 ggtgtggcgg accgctatca ggacatagcg ttggctaccc gtgatattgc tgaagagctt 7800 ggcggcgaat gggctgaccg cttcctcgtg ctttacggta tcgccgctcc cgattcgcag 7860 cgcatcgcct tctatcgcct tcttgacgag ttcttctgag cgggactctg gggttcgaaa 7920 tgaccgacca agcgacgccc aacctgccat cacgatggcc gcaataaaat atctttattt 7980 tcattacatc tgtgtgttgg ttttttgtgt gaatcgatag cgataaggat ccgcgtatgg 8040 tgcactctca gtacaatctg ctctgatgcc gcatagttaa gccagccccg acacccgcca 8100 acacccgctg acgcgccctg acgggcttgt ctgctcccgg catccgctta cagacaagct 8160 gtgaccgtct ccgggagctg catgtgtcag 
aggttttcac cgtcatcacc gaaacgcgcg 8220 agacgaaagg gcctcgtgat acgcctattt ttataggtta atgtcatgat aataatggtt 8280 tcttagacgt caggtggcac ttttcgggga aatgtgcgcg gaacccctat ttgtttattt 8340 ttctaaatac attcaaatat gtatccgctc atgagacaat aaccctgata aatgcttcaa 8400 taatattgaa aaaggaagag tatgagtatt caacatttcc gtgtcgccct tattcccttt 8460 tttgcggcat tttgccttcc tgtttttgct cacccagaaa cgctggtgaa agtaaaagat 8520 gctgaagatc agttgggtgc acgagtgggt tacatcgaac tggatctcaa cagcggtaag 8580 atccttgaga gttttcgccc cgaagaacgt tttccaatga tgagcacttt taaagttctg 8640 ctatgtggcg cggtattatc ccgtattgac gccgggcaag agcaactcgg tcgccgcata 8700 cactattctc agaatgactt ggttgagtac tcaccagtca cagaaaagca tcttacggat 8760 ggcatgacag taagagaatt atgcagtgct gccataacca tgagtgataa cactgcggcc 8820 aacttacttc tgacaacgat cggaggaccg aaggagctaa ccgctttttt gcacaacatg 8880 ggggatcatg taactcgcct tgatcgttgg gaaccggagc tgaatgaagc cataccaaac 8940 gacgagcgtg acaccacgat gcctgtagca atggcaacaa cgttgcgcaa actattaact 9000 ggcgaactac ttactctagc ttcccggcaa caattaatag actggatgga ggcggataaa 9060 gttgcaggac cacttctgcg ctcggccctt ccggctggct ggtttattgc tgataaatct 9120 ggagccggtg agcgtgggtc tcgcggtatc attgcagcac tggggccaga tggtaagccc 9180 tcccgtatcg tagttatcta cacgacgggg agtcaggcaa ctatggatga acgaaataga 9240 cagatcgctg agataggtgc ctcactgatt aagcattggt aactgtcaga ccaagtttac 9300 tcatatatac tttagattga tttaaaactt catttttaat ttaaaaggat ctaggtgaag 9360 atcctttttg ataatctcat gaccaaaatc ccttaacgtg agttttcgtt ccactgagcg 9420 tcagaccccg tagaaaagat caaaggatct tcttgagatc ctttttttct gcgcgtaatc 9480 tgctgcttgc aaacaaaaaa accaccgcta ccagcggtgg tttgtttgcc ggatcaagag 9540 ctaccaactc tttttccgaa ggtaactggc ttcagcagag cgcagatacc aaatactgtc 9600 cttctagtgt agccgtagtt aggccaccac ttcaagaact ctgtagcacc gcctacatac 9660 ctcgctctgc taatcctgtt accagtggct gctgccagtg gcgataagtc gtgtcttacc 9720 gggttggact caagacgata gttaccggat aaggcgcagc ggtcgggctg aacggggggt 9780 tcgtgcacac agcccagctt ggagcgaacg acctacaccg aactgagata cctacagcgt 9840 gagctatgag aaagcgccac gcttcccgaa gggagaaagg 
cggacaggta tccggtaagc 9900 ggcagggtcg gaacaggaga gcgcacgagg gagcttccag ggggaaacgc ctggtatctt 9960 tatagtcctg tcgggtttcg ccacctctga cttgagcgtc gatttttgtg atgctcgtca 10020 ggggggcgga gcctatggaa aaacgccagc aacgcggcct ttttacggtt cctggccttt 10080 tgctggcctt ttgctcacat ggctcgacag atct 10114 14 5993 DNA Artificial Sequence Description of Artificial Sequence pClneoERev, EIAV Rev expression plasmid 14 tcaatattgg ccattagcca tattattcat tggttatata gcataaatca atattggcta 60 ttggccattg catacgttgt atctatatca taatatgtac atttatattg gctcatgtcc 120 aatatgaccg ccatgttggc attgattatt gactagttat taatagtaat caattacggg 180 gtcattagtt catagcccat atatggagtt ccgcgttaca taacttacgg taaatggccc 240 gcctggctga ccgcccaacg acccccgccc attgacgtca ataatgacgt atgttcccat 300 agtaacgcca atagggactt tccattgacg tcaatgggtg gagtatttac ggtaaactgc 360 ccacttggca gtacatcaag tgtatcatat gccaagtccg ccccctattg acgtcaatga 420 cggtaaatgg cccgcctggc attatgccca gtacatgacc ttacgggact ttcctacttg 480 gcagtacatc tacgtattag tcatcgctat taccatggtg atgcggtttt ggcagtacac 540 caatgggcgt ggatagcggt ttgactcacg gggatttcca agtctccacc ccattgacgt 600 caatgggagt ttgttttggc accaaaatca acgggacttt ccaaaatgtc gtaacaactg 660 cgatcgcccg ccccgttgac gcaaatgggc ggtaggcgtg tacggtggga ggtctatata 720 agcagagctc gtttagtgaa ccgtcagatc actagaagct ttattgcggt agtttatcac 780 agttaaattg ctaacgcagt cagtgcttct gacacaacag tctcgaactt aagctgcagt 840 gactctctta aggtagcctt gcagaagttg gtcgtgaggc actgggcagg taagtatcaa 900 ggttacaaga caggtttaag gagaccaata gaaactgggc ttgtcgagac agagaagact 960 cttgcgtttc tgataggcac ctattggtct tactgacatc cactttgcct ttctctccac 1020 aggtgtccac tcccagttca attacagctc ttaaggctag agtacttaat acgactcact 1080 ataggctagt aacggccgcc agtgtgctgg aattcggctt atggcagaat cgaaggaagc 1140 aagagaccaa gaaatgaacc tgaaagaaga atctaaagaa gaaaaaagaa gaaatgactg 1200 gtggaaaata gatcctcagg gccctctgga aggtgaccag tggtgcaggg tcctccggca 1260 gtcgttacct gaagaaaaaa ttccatcaca aacatgcatc gcgagaagac acctgggacc 1320 aggcccaaca caacatacac ctagcaggcg tgaccggtgg atcaggggac 
aaatactaca 1380 agcagaagta ctccaggaac gactggaatg gagaatcaga ggagtacaac aggcggccaa 1440 agagctgggt gaagtcaatc gaggcatttg gagagagcta tatttccgag aagaccaaag 1500 gggagatttc tcagcctggg gcggctatca acgagcacaa gaacggctct ggggggaaca 1560 atcctcacca agggtcctta gacctggaga ttcgaagcga aggaggaaac atttatgaag 1620 ccgaattctg cagatatcca tcacactggc ggccgcttcc ctttagtgag ggttaatgct 1680 tcgagcagac atgataagat acattgatga gtttggacaa accacaacta gaatgcagtg 1740 aaaaaaatgc tttatttgtg aaatttgtga tgctattgct ttatttgtaa ccattataag 1800 ctgcaataaa caagttaaca acaacaattg cattcatttt atgtttcagg ttcaggggga 1860 gatgtgggag gttttttaaa gcaagtaaaa cctctacaaa tgtggtaaaa tccgataagg 1920 atcgatccgg gctggcgtaa tagcgaagag gcccgcaccg atcgcccttc ccaacagttg 1980 cgcagcctga atggcgaatg gacgcgccct gtagcggcgc attaagcgcg gcgggtgtgg 2040 tggttacgcg cagcgtgacc gctacacttg ccagcgccct agcgcccgct cctttcgctt 2100 tcttcccttc ctttctcgcc acgttcgccg gctttccccg tcaagctcta aatcgggggc 2160 tccctttagg gttccgattt agagctttac ggcacctcga ccgcaaaaaa cttgatttgg 2220 gtgatggttc acgtagtggg ccatcgccct gatagacggt ttttcgccct ttgacgttgg 2280 agtccacgtt ctttaatagt ggactcttgt tccaaactgg aacaacactc aaccctatct 2340 cggtctattc ttttgattta taagggattt tgccgatttc ggcctattgg ttaaaaaatg 2400 agctgattta acaaatattt aacgcgaatt ttaacaaaat attaacgttt acaatttcgc 2460 ctgatgcggt attttctcct tacgcatctg tgcggtattt cacaccgcat acgcggatct 2520 gcgcagcacc atggcctgaa ataacctctg aaagaggaac ttggttaggt accttctgag 2580 gcggaaagaa ccagctgtgg aatgtgtgtc agttagggtg tggaaagtcc ccaggctccc 2640 cagcaggcag aagtatgcaa agcatgcatc tcaattagtc agcaaccagg tgtggaaagt 2700 ccccaggctc cccagcaggc agaagtatgc aaagcatgca tctcaattag tcagcaacca 2760 tagtcccgcc cctaactccg cccatcccgc ccctaactcc gcccagttcc gcccattctc 2820 cgccccatgg ctgactaatt ttttttattt atgcagaggc cgaggccgcc tcggcctctg 2880 agctattcca gaagtagtga ggaggctttt ttggaggcct aggcttttgc aaaaagcttg 2940 attcttctga cacaacagtc tcgaacttaa ggctagagcc accatgattg aacaagatgg 3000 attgcacgca ggttctccgg ccgcttgggt ggagaggcta ttcggctatg actgggcaca 
3060 acagacaatc ggctgctctg atgccgccgt gttccggctg tcagcgcagg ggcgcccggt 3120 tctttttgtc aagaccgacc tgtccggtgc cctgaatgaa ctgcaggacg aggcagcgcg 3180 gctatcgtgg ctggccacga cgggcgttcc ttgcgcagct gtgctcgacg ttgtcactga 3240 agcgggaagg gactggctgc tattgggcga agtgccgggg caggatctcc tgtcatctca 3300 ccttgctcct gccgagaaag tatccatcat ggctgatgca atgcggcggc tgcatacgct 3360 tgatccggct acctgcccat tcgaccacca agcgaaacat cgcatcgagc gagcacgtac 3420 tcggatggaa gccggtcttg tcgatcagga tgatctggac gaagagcatc aggggctcgc 3480 gccagccgaa ctgttcgcca ggctcaaggc gcgcatgccc gacggcgagg atctcgtcgt 3540 gacccatggc gatgcctgct tgccgaatat catggtggaa aatggccgct tttctggatt 3600 catcgactgt ggccggctgg gtgtggcgga ccgctatcag gacatagcgt tggctacccg 3660 tgatattgct gaagagcttg gcggcgaatg ggctgaccgc ttcctcgtgc tttacggtat 3720 cgccgctccc gattcgcagc gcatcgcctt ctatcgcctt cttgacgagt tcttctgagc 3780 gggactctgg ggttcgaaat gaccgaccaa gcgacgccca acctgccatc acgatggccg 3840 caataaaata tctttatttt cattacatct gtgtgttggt tttttgtgtg aatcgatagc 3900 gataaggatc cgcgtatggt gcactctcag tacaatctgc tctgatgccg catagttaag 3960 ccagccccga cacccgccaa cacccgctga cgcgccctga cgggcttgtc tgctcccggc 4020 atccgcttac agacaagctg tgaccgtctc cgggagctgc atgtgtcaga ggttttcacc 4080 gtcatcaccg aaacgcgcga gacgaaaggg cctcgtgata cgcctatttt tataggttaa 4140 tgtcatgata ataatggttt cttagacgtc aggtggcact tttcggggaa atgtgcgcgg 4200 aacccctatt tgtttatttt tctaaataca ttcaaatatg tatccgctca tgagacaata 4260 accctgataa atgcttcaat aatattgaaa aaggaagagt atgagtattc aacatttccg 4320 tgtcgccctt attccctttt ttgcggcatt ttgccttcct gtttttgctc acccagaaac 4380 gctggtgaaa gtaaaagatg ctgaagatca gttgggtgca cgagtgggtt acatcgaact 4440 ggatctcaac agcggtaaga tccttgagag ttttcgcccc gaagaacgtt ttccaatgat 4500 gagcactttt aaagttctgc tatgtggcgc ggtattatcc cgtattgacg ccgggcaaga 4560 gcaactcggt cgccgcatac actattctca gaatgacttg gttgagtact caccagtcac 4620 agaaaagcat cttacggatg gcatgacagt aagagaatta tgcagtgctg ccataaccat 4680 gagtgataac actgcggcca acttacttct gacaacgatc ggaggaccga aggagctaac 4740 
cgcttttttg cacaacatgg gggatcatgt aactcgcctt gatcgttggg aaccggagct 4800 gaatgaagcc ataccaaacg acgagcgtga caccacgatg cctgtagcaa tggcaacaac 4860 gttgcgcaaa ctattaactg gcgaactact tactctagct tcccggcaac aattaataga 4920 ctggatggag gcggataaag ttgcaggacc acttctgcgc tcggcccttc cggctggctg 4980 gtttattgct gataaatctg gagccggtga gcgtgggtct cgcggtatca ttgcagcact 5040 ggggccagat ggtaagccct cccgtatcgt agttatctac acgacgggga gtcaggcaac 5100 tatggatgaa cgaaatagac agatcgctga gataggtgcc tcactgatta agcattggta 5160 actgtcagac caagtttact catatatact ttagattgat ttaaaacttc atttttaatt 5220 taaaaggatc taggtgaaga tcctttttga taatctcatg accaaaatcc cttaacgtga 5280 gttttcgttc cactgagcgt cagaccccgt agaaaagatc aaaggatctt cttgagatcc 5340 tttttttctg cgcgtaatct gctgcttgca aacaaaaaaa ccaccgctac cagcggtggt 5400 ttgtttgccg gatcaagagc taccaactct ttttccgaag gtaactggct tcagcagagc 5460 gcagatacca aatactgtcc ttctagtgta gccgtagtta ggccaccact tcaagaactc 5520 tgtagcaccg cctacatacc tcgctctgct aatcctgtta ccagtggctg ctgccagtgg 5580 cgataagtcg tgtcttaccg ggttggactc aagacgatag ttaccggata aggcgcagcg 5640 gtcgggctga acggggggtt cgtgcacaca gcccagcttg gagcgaacga cctacaccga 5700 actgagatac ctacagcgtg agctatgaga aagcgccacg cttcccgaag ggagaaaggc 5760 ggacaggtat ccggtaagcg gcagggtcgg aacaggagag cgcacgaggg agcttccagg 5820 gggaaacgcc tggtatcttt atagtcctgt cgggtttcgc cacctctgac ttgagcgtcg 5880 atttttgtga tgctcgtcag gggggcggag cctatggaaa aacgccagca acgcggcctt 5940 tttacggttc ctggcctttt gctggccttt tgctcacatg gctcgacaga tct 5993 15 5961 DNA Artificial Sequence Description of Artificial Sequence pESYNREV, codon-optimised EIAV Rev expression plasmid 15 tcaatattgg ccattagcca tattattcat tggttatata gcataaatca atattggcta 60 ttggccattg catacgttgt atctatatca taatatgtac atttatattg gctcatgtcc 120 aatatgaccg ccatgttggc attgattatt gactagttat taatagtaat caattacggg 180 gtcattagtt catagcccat atatggagtt ccgcgttaca taacttacgg taaatggccc 240 gcctggctga ccgcccaacg acccccgccc attgacgtca ataatgacgt atgttcccat 300 agtaacgcca atagggactt tccattgacg 
tcaatgggtg gagtatttac ggtaaactgc 360 ccacttggca gtacatcaag tgtatcatat gccaagtccg ccccctattg acgtcaatga 420 cggtaaatgg cccgcctggc attatgccca gtacatgacc ttacgggact ttcctacttg 480 gcagtacatc tacgtattag tcatcgctat taccatggtg atgcggtttt ggcagtacac 540 caatgggcgt ggatagcggt ttgactcacg gggatttcca agtctccacc ccattgacgt 600 caatgggagt ttgttttggc accaaaatca acgggacttt ccaaaatgtc gtaacaactg 660 cgatcgcccg ccccgttgac gcaaatgggc ggtaggcgtg tacggtggga ggtctatata 720 agcagagctc gtttagtgaa ccgtcagatc actagaagct ttattgcggt agtttatcac 780 agttaaattg ctaacgcagt cagtgcttct gacacaacag tctcgaactt aagctgcagt 840 gactctctta aggtagcctt gcagaagttg gtcgtgaggc actgggcagg taagtatcaa 900 ggttacaaga caggtttaag gagaccaata gaaactgggc ttgtcgagac agagaagact 960 cttgcgtttc tgataggcac ctattggtct tactgacatc cactttgcct ttctctccac 1020 aggtgtccac tcccagttca attacagctc ttaaggctag agtacttaat acgactcact 1080 ataggctagc ctcgagaatt cgccaccatg gctgagagca aggaggccag ggatcaagag 1140 atgaacctca aggaagagag caaagaggag aagcgccgca acgactggtg gaagatcgac 1200 ccacaaggcc ccctggaggg ggaccagtgg tgccgcgtgc tgagacagtc cctgcccgag 1260 gagaagattc ctagccagac ctgcatcgcc agaagacacc tcggccccgg tcccacccag 1320 cacacaccct ccagaaggga taggtggatt aggggccaga ttttgcaagc cgaggtcctc 1380 caagaaaggc tggaatggag aattaggggc gtgcaacaag ccgctaaaga gctgggagag 1440 gtgaatcgcg gcatctggag ggagctctac ttccgcgagg accagagggg cgatttctcc 1500 gcatggggag gctaccagag ggcacaagaa aggctgtggg gcgagcagag cagcccccgc 1560 gtcttgaggc ccggagactc caaaagacgc cgcaaacacc tgtgaagtcg acccgggcgg 1620 ccgcttccct ttagtgaggg ttaatgcttc gagcagacat gataagatac attgatgagt 1680 ttggacaaac cacaactaga atgcagtgaa aaaaatgctt tatttgtgaa atttgtgatg 1740 ctattgcttt atttgtaacc attataagct gcaataaaca agttaacaac aacaattgca 1800 ttcattttat gtttcaggtt cagggggaga tgtgggaggt tttttaaagc aagtaaaacc 1860 tctacaaatg tggtaaaatc cgataaggat cgatccgggc tggcgtaata gcgaagaggc 1920 ccgcaccgat cgcccttccc aacagttgcg cagcctgaat ggcgaatgga cgcgccctgt 1980 agcggcgcat taagcgcggc gggtgtggtg gttacgcgca gcgtgaccgc 
tacacttgcc 2040 agcgccctag cgcccgctcc tttcgctttc ttcccttcct ttctcgccac gttcgccggc 2100 tttccccgtc aagctctaaa tcgggggctc cctttagggt tccgatttag agctttacgg 2160 cacctcgacc gcaaaaaact tgatttgggt gatggttcac gtagtgggcc atcgccctga 2220 tagacggttt ttcgcccttt gacgttggag tccacgttct ttaatagtgg actcttgttc 2280 caaactggaa caacactcaa ccctatctcg gtctattctt ttgatttata agggattttg 2340 ccgatttcgg cctattggtt aaaaaatgag ctgatttaac aaatatttaa cgcgaatttt 2400 aacaaaatat taacgtttac aatttcgcct gatgcggtat tttctcctta cgcatctgtg 2460 cggtatttca caccgcatac gcggatctgc gcagcaccat ggcctgaaat aacctctgaa 2520 agaggaactt ggttaggtac cttctgaggc ggaaagaacc agctgtggaa tgtgtgtcag 2580 ttagggtgtg gaaagtcccc aggctcccca gcaggcagaa gtatgcaaag catgcatctc 2640 aattagtcag caaccaggtg tggaaagtcc ccaggctccc cagcaggcag aagtatgcaa 2700 agcatgcatc tcaattagtc agcaaccata gtcccgcccc taactccgcc catcccgccc 2760 ctaactccgc ccagttccgc ccattctccg ccccatggct gactaatttt ttttatttat 2820 gcagaggccg aggccgcctc ggcctctgag ctattccaga agtagtgagg aggctttttt 2880 ggaggcctag gcttttgcaa aaagcttgat tcttctgaca caacagtctc gaacttaagg 2940 ctagagccac catgattgaa caagatggat tgcacgcagg ttctccggcc gcttgggtgg 3000 agaggctatt cggctatgac tgggcacaac agacaatcgg ctgctctgat gccgccgtgt 3060 tccggctgtc agcgcagggg cgcccggttc tttttgtcaa gaccgacctg tccggtgccc 3120 tgaatgaact gcaggacgag gcagcgcggc tatcgtggct ggccacgacg ggcgttcctt 3180 gcgcagctgt gctcgacgtt gtcactgaag cgggaaggga ctggctgcta ttgggcgaag 3240 tgccggggca ggatctcctg tcatctcacc ttgctcctgc cgagaaagta tccatcatgg 3300 ctgatgcaat gcggcggctg catacgcttg atccggctac ctgcccattc gaccaccaag 3360 cgaaacatcg catcgagcga gcacgtactc ggatggaagc cggtcttgtc gatcaggatg 3420 atctggacga agagcatcag gggctcgcgc cagccgaact gttcgccagg ctcaaggcgc 3480 gcatgcccga cggcgaggat ctcgtcgtga cccatggcga tgcctgcttg ccgaatatca 3540 tggtggaaaa tggccgcttt tctggattca tcgactgtgg ccggctgggt gtggcggacc 3600 gctatcagga catagcgttg gctacccgtg atattgctga agagcttggc ggcgaatggg 3660 ctgaccgctt cctcgtgctt tacggtatcg ccgctcccga ttcgcagcgc atcgccttct 
3720 atcgccttct tgacgagttc ttctgagcgg gactctgggg ttcgaaatga ccgaccaagc 3780 gacgcccaac ctgccatcac gatggccgca ataaaatatc tttattttca ttacatctgt 3840 gtgttggttt tttgtgtgaa tcgatagcga taaggatccg cgtatggtgc actctcagta 3900 caatctgctc tgatgccgca tagttaagcc agccccgaca cccgccaaca cccgctgacg 3960 cgccctgacg ggcttgtctg ctcccggcat ccgcttacag acaagctgtg accgtctccg 4020 ggagctgcat gtgtcagagg ttttcaccgt catcaccgaa acgcgcgaga cgaaagggcc 4080 tcgtgatacg cctattttta taggttaatg tcatgataat aatggtttct tagacgtcag 4140 gtggcacttt tcggggaaat gtgcgcggaa cccctatttg tttatttttc taaatacatt 4200 caaatatgta tccgctcatg agacaataac cctgataaat gcttcaataa tattgaaaaa 4260 ggaagagtat gagtattcaa catttccgtg tcgcccttat tccctttttt gcggcatttt 4320 gccttcctgt ttttgctcac ccagaaacgc tggtgaaagt aaaagatgct gaagatcagt 4380 tgggtgcacg agtgggttac atcgaactgg atctcaacag cggtaagatc cttgagagtt 4440 ttcgccccga agaacgtttt ccaatgatga gcacttttaa agttctgcta tgtggcgcgg 4500 tattatcccg tattgacgcc gggcaagagc aactcggtcg ccgcatacac tattctcaga 4560 atgacttggt tgagtactca ccagtcacag aaaagcatct tacggatggc atgacagtaa 4620 gagaattatg cagtgctgcc ataaccatga gtgataacac tgcggccaac ttacttctga 4680 caacgatcgg aggaccgaag gagctaaccg cttttttgca caacatgggg gatcatgtaa 4740 ctcgccttga tcgttgggaa ccggagctga atgaagccat accaaacgac gagcgtgaca 4800 ccacgatgcc tgtagcaatg gcaacaacgt tgcgcaaact attaactggc gaactactta 4860 ctctagcttc ccggcaacaa ttaatagact ggatggaggc ggataaagtt gcaggaccac 4920 ttctgcgctc ggcccttccg gctggctggt ttattgctga taaatctgga gccggtgagc 4980 gtgggtctcg cggtatcatt gcagcactgg ggccagatgg taagccctcc cgtatcgtag 5040 ttatctacac gacggggagt caggcaacta tggatgaacg aaatagacag atcgctgaga 5100 taggtgcctc actgattaag cattggtaac tgtcagacca agtttactca tatatacttt 5160 agattgattt aaaacttcat ttttaattta aaaggatcta ggtgaagatc ctttttgata 5220 atctcatgac caaaatccct taacgtgagt tttcgttcca ctgagcgtca gaccccgtag 5280 aaaagatcaa aggatcttct tgagatcctt tttttctgcg cgtaatctgc tgcttgcaaa 5340 caaaaaaacc accgctacca gcggtggttt gtttgccgga tcaagagcta ccaactcttt 5400 
ttccgaaggt aactggcttc agcagagcgc agataccaaa tactgtcctt ctagtgtagc 5460 cgtagttagg ccaccacttc aagaactctg tagcaccgcc tacatacctc gctctgctaa 5520 tcctgttacc agtggctgct gccagtggcg ataagtcgtg tcttaccggg ttggactcaa 5580 gacgatagtt accggataag gcgcagcggt cgggctgaac ggggggttcg tgcacacagc 5640 ccagcttgga gcgaacgacc tacaccgaac tgagatacct acagcgtgag ctatgagaaa 5700 gcgccacgct tcccgaaggg agaaaggcgg acaggtatcc ggtaagcggc agggtcggaa 5760 caggagagcg cacgagggag cttccagggg gaaacgcctg gtatctttat agtcctgtcg 5820 ggtttcgcca cctctgactt gagcgtcgat ttttgtgatg ctcgtcaggg gggcggagcc 5880 tatggaaaaa cgccagcaac gcggcctttt tacggttcct ggccttttgc tggccttttg 5940 ctcacatggc tcgacagatc t 5961 16 42 DNA Artificial Sequence Description of Artificial Sequence Primer 16 ggctagagaa ttccaggtaa gatgggcgat cccctcacct gg 42 17 22 DNA Artificial Sequence Description of Artificial Sequence Primer 17 ttgggtactc ctcgctaggt tc 22 18 21 DNA Artificial Sequence Description of Artificial Sequence Sequence flanking the EIAV gag/pol ORF 18 tctagagaat tcgccaccat g 21 19 18 RNA Artificial Sequence Description of Artificial Sequence Sequence flanking the EIAV gag/pol ORF 19 ugaacccggg gcggccgc 18 20 42 DNA Artificial Sequence Description of Artificial Sequence Primer 20 cagtacccgc gggccaccat gtttgactgt atggatgttc tg 42 21 37 DNA Artificial Sequence Description of Artificial Sequence Primer 21 cagtacctgc agatcattgc acgagtggtg actgact 37 22 38 DNA Artificial Sequence Description of Artificial Sequence Primer 22 caggttattc tagagtcgac gctctcatta cttgtaac 38 23 41 DNA Artificial Sequence Description of Artificial Sequence Primer 23 cgaatgcgtt ctagagtcga ccatgttcac cagggatttt g 41 24 23 DNA Artificial Sequence Description of Artificial Sequence Primer 24 cacctagcag gcgtgaccgg tgg 23 25 37 DNA Artificial Sequence Description of Artificial Sequence Primer 25 cctaccaatt gtataaaacc cctcataaaa accccac 37 26 24 DNA Artificial Sequence Description of Artificial Sequence Primer 26 cacaggtcaa acctcctagg aatg 24 27 21 DNA Artificial 
Sequence Description of Artificial Sequence Primer 27 tcctgctcaa cttcctgtcg a 21
28 19 DNA Artificial Sequence Description of Artificial Sequence Probe 28 cgagacgcta ccatggcta 19
29 29 DNA Artificial Sequence Description of Artificial Sequence Primer 29 accagtagtt aatttctgag acccttgta 29
30 21 DNA Artificial Sequence Description of Artificial Sequence Primer 30 attgggagac cctttgacat t 21
31 30 DNA Artificial Sequence Description of Artificial Sequence Probe 31 caccttctct aacttcttga gcgccttgct 30
32 42 DNA Artificial Sequence Description of Artificial Sequence Primer 32 actgccgcgg gccaccatgt ttgactgtat ggatgttctg tc 42
33 66 DNA Artificial Sequence Description of Artificial Sequence Primer 33 actgccgcgg gccaccatgg actacaagga cgacgatgac aagtttgact gtatggatgt 60 tctgtc 66
34 29 DNA Artificial Sequence Description of Artificial Sequence Primer 34 actggcggcc gctcactgca gcagtggtg 29
35 38 DNA Artificial Sequence Description of Artificial Sequence Primer 35 caggttattc tagagtcgac gctctcatta cttgtaac 38
36 41 DNA Artificial Sequence Description of Artificial Sequence Primer 36 cgaatgcgtt ctagagtcga ccatgttcac cagggatttt g 41
37 19 DNA Artificial Sequence Description of Artificial Sequence Primer 37 tgtagctctc tgagcactc 19
38 19 DNA Artificial Sequence Description of Artificial Sequence Primer 38 tacgccttct tctttcccc 19
39 21 DNA Artificial Sequence Description of Artificial Sequence Primer 39 cttttataac cagaaccggg c 21
40 21 DNA Artificial Sequence Description of Artificial Sequence Primer 40 caagtagaag ccaggaaagt c 21
41 21 DNA Artificial Sequence Description of Artificial Sequence Primer 41 ctaagaagac ccacacttct g 21
42 19 DNA Artificial Sequence Description of Artificial Sequence Primer 42 aagtgaggtg aaaactggg 19
43 18 DNA Artificial Sequence Description of Artificial Sequence Primer 43 ttcacagcct ggcataac 18
44 19 DNA Artificial Sequence Description of Artificial Sequence Primer 44 gagaaggaag tgagccatc 19
45 21 DNA Artificial Sequence Description of Artificial Sequence Primer 45 tctctgtgca ttcctgcttt g 21
46 20 DNA Artificial Sequence Description of Artificial Sequence Primer 46 tggacacatg actcactacc 20
47 20 DNA Artificial Sequence Description of Artificial Sequence Primer 47 atgttctgtc agtgagtccc 20
48 19 DNA Artificial Sequence Description of Artificial Sequence Primer 48 gcatgtcaga ggacaactg 19
49 18 DNA Artificial Sequence Description of Artificial Sequence Primer 49 agcctggaaa atgccatc 18
50 22 DNA Artificial Sequence Description of Artificial Sequence Primer 50 ttacagcttc cttggacatg cc 22
51 20 DNA Artificial Sequence Description of Artificial Sequence Primer 51 agatgctgag ccctagcttc 20
52 20 DNA Artificial Sequence Description of Artificial Sequence Primer 52 ttactacgca gagccactgg 20
53 19 DNA Artificial Sequence Description of Artificial Sequence Primer 53 ggaagatgga agagggaac 19
54 19 DNA Artificial Sequence Description of Artificial Sequence Primer 54 caaatttact gggggttgg 19
55 20 DNA Artificial Sequence Description of Artificial Sequence Primer 55 ggctggattt tggattgaag 20
56 22 DNA Artificial Sequence Description of Artificial Sequence Primer 56 ttctgtcctc tcactacctt gg 22
57 20 DNA Artificial Sequence Description of Artificial Sequence Primer 57 cattaccgcg agtcactaac 20
58 20 DNA Artificial Sequence Description of Artificial Sequence Primer 58 cgtagacaaa atggtgaagg 20
59 20 DNA Artificial Sequence Description of Artificial Sequence Primer 59 gactccacga catactcagc 20
60 18 DNA Artificial Sequence Description of Artificial Sequence Primer 60 gcttcttcat tgacccac 18
61 20 DNA Artificial Sequence Description of Artificial Sequence Primer 61 cttcaccgtc aggtctttac 20
62 20 DNA Artificial Sequence Description of Artificial Sequence Primer 62 gcaaggaccg gaatgagaac 20
63 21 DNA Artificial Sequence Description of Artificial Sequence Primer 63 tctaggggca gctcagaaaa g 21
64 21 DNA Artificial Sequence Description of Artificial Sequence Primer 64 agaataaagg ggtagtgaag g 21
65 18 DNA Artificial Sequence Description of Artificial
Sequence Primer 65 catcaatgtc cccacttg 18 66 20 DNA Artificial Sequence Description of Artificial Sequence Primer 66 tgccagtagt agccacgaag 20 67 19 DNA Artificial Sequence Description of Artificial Sequence Primer 67 tgagcagttc attccaccc 19 68 10998 DNA Artificial Sequence Description of Artificial Sequence pONY8.0Z 68 agatcttgaa taataaaatg tgtgtttgtc cgaaatacgc gttttgagat ttctgtcgcc 60 gactaaattc atgtcgcgcg atagtggtgt ttatcgccga tagagatggc gatattggaa 120 aaattgatat ttgaaaatat ggcatattga aaatgtcgcc gatgtgagtt tctgtgtaac 180 tgatatcgcc atttttccaa aagtgatttt tgggcatacg cgatatctgg cgatagcgct 240 tatatcgttt acgggggatg gcgatagacg actttggtga cttgggcgat tctgtgtgtc 300 gcaaatatcg cagtttcgat ataggtgaca gacgatatga ggctatatcg ccgatagagg 360 cgacatcaag ctggcacatg gccaatgcat atcgatctat acattgaatc aatattggcc 420 attagccata ttattcattg gttatatagc ataaatcaat attggctatt ggccattgca 480 tacgttgtat ccatatcgta atatgtacat ttatattggc tcatgtccaa cattaccgcc 540 atgttgacat tgattattga ctagttatta atagtaatca attacggggt cattagttca 600 tagcccatat atggagttcc gcgttacata acttacggta aatggcccgc ctggctgacc 660 gcccaacgac ccccgcccat tgacgtcaat aatgacgtat gttcccatag taacgccaat 720 agggactttc cattgacgtc aatgggtgga gtatttacgg taaactgccc acttggcagt 780 acatcaagtg tatcatatgc caagtccgcc ccctattgac gtcaatgacg gtaaatggcc 840 cgcctggcat tatgcccagt acatgacctt acgggacttt cctacttggc agtacatcta 900 cgtattagtc atcgctatta ccatggtgat gcggttttgg cagtacacca atgggcgtgg 960 atagcggttt gactcacggg gatttccaag tctccacccc attgacgtca atgggagttt 1020 gttttggcac caaaatcaac gggactttcc aaaatgtcgt aacaactgcg atcgcccgcc 1080 ccgttgacgc aaatgggcgg taggcgtgta cggtgggagg tctatataag cagagctcgt 1140 ttagtgaacc gggcactcag attctgcggt ctgagtccct tctctgctgg gctgaaaagg 1200 cctttgtaat aaatataatt ctctactcag tccctgtctc tagtttgtct gttcgagatc 1260 ctacagttgg cgcccgaaca gggacctgag aggggcgcag accctacctg ttgaacctgg 1320 ctgatcgtag gatccccggg acagcagagg agaacttaca gaagtcttct ggaggtgttc 1380 ctggccagaa cacaggagga caggtaagat tgggagaccc tttgacattg 
gagcaaggcg 1440 ctcaagaagt tagagaaggt gacggtacaa gggtctcaga aattaactac tggtaactgt 1500 aattgggcgc taagtctagt agacttattt catgatacca actttgtaaa agaaaaggac 1560 tggcagctga gggatgtcat tccattgctg gaagatgtaa ctcagacgct gtcaggacaa 1620 gaaagagagg cctttgaaag aacatggtgg gcaatttctg ctgtaaagat gggcctccag 1680 attaataatg tagtagatgg aaaggcatca ttccagctcc taagagcgaa atatgaaaag 1740 aagactgcta ataaaaagca gtctgagccc tctgaagaat atctctagaa ctagtggatc 1800 ccccgggctg caggagtggg gaggcacgat ggccgctttg gtcgaggcgg atccggccat 1860 tagccatatt attcattggt tatatagcat aaatcaatat tggctattgg ccattgcata 1920 cgttgtatcc atatcataat atgtacattt atattggctc atgtccaaca ttaccgccat 1980 gttgacattg attattgact agttattaat agtaatcaat tacggggtca ttagttcata 2040 gcccatatat ggagttccgc gttacataac ttacggtaaa tggcccgcct ggctgaccgc 2100 ccaacgaccc ccgcccattg acgtcaataa tgacgtatgt tcccatagta acgccaatag 2160 ggactttcca ttgacgtcaa tgggtggagt atttacggta aactgcccac ttggcagtac 2220 atcaagtgta tcatatgcca agtacgcccc ctattgacgt caatgacggt aaatggcccg 2280 cctggcatta tgcccagtac atgaccttat gggactttcc tacttggcag tacatctacg 2340 tattagtcat cgctattacc atggtgatgc ggttttggca gtacatcaat gggcgtggat 2400 agcggtttga ctcacgggga tttccaagtc tccaccccat tgacgtcaat gggagtttgt 2460 tttggcacca aaatcaacgg gactttccaa aatgtcgtaa caactccgcc ccattgacgc 2520 aaatgggcgg taggcatgta cggtgggagg tctatataag cagagctcgt ttagtgaacc 2580 gtcagatcgc ctggagacgc catccacgct gttttgacct ccatagaaga caccgggacc 2640 gatccagcct ccgcggcccc aagcttcagc tgctcgagga tctgcggatc cggggaattc 2700 cccagtctca ggatccacca tgggggatcc cgtcgtttta caacgtcgtg actgggaaaa 2760 ccctggcgtt acccaactta atcgccttgc agcacatccc cctttcgcca gctggcgtaa 2820 tagcgaagag gcccgcaccg atcgcccttc ccaacagttg cgcagcctga atggcgaatg 2880 gcgctttgcc tggtttccgg caccagaagc ggtgccggaa agctggctgg agtgcgatct 2940 tcctgaggcc gatactgtcg tcgtcccctc aaactggcag atgcacggtt acgatgcgcc 3000 catctacacc aacgtaacct atcccattac ggtcaatccg ccgtttgttc ccacggagaa 3060 tccgacgggt tgttactcgc tcacatttaa tgttgatgaa agctggctac aggaaggcca 
3120 gacgcgaatt atttttgatg gcgttaactc ggcgtttcat ctgtggtgca acgggcgctg 3180 ggtcggttac ggccaggaca gtcgtttgcc gtctgaattt gacctgagcg catttttacg 3240 cgccggagaa aaccgcctcg cggtgatggt gctgcgttgg agtgacggca gttatctgga 3300 agatcaggat atgtggcgga tgagcggcat tttccgtgac gtctcgttgc tgcataaacc 3360 gactacacaa atcagcgatt tccatgttgc cactcgcttt aatgatgatt tcagccgcgc 3420 tgtactggag gctgaagttc agatgtgcgg cgagttgcgt gactacctac gggtaacagt 3480 ttctttatgg cagggtgaaa cgcaggtcgc cagcggcacc gcgcctttcg gcggtgaaat 3540 tatcgatgag cgtggtggtt atgccgatcg cgtcacacta cgtctgaacg tcgaaaaccc 3600 gaaactgtgg agcgccgaaa tcccgaatct ctatcgtgcg gtggttgaac tgcacaccgc 3660 cgacggcacg ctgattgaag cagaagcctg cgatgtcggt ttccgcgagg tgcggattga 3720 aaatggtctg ctgctgctga acggcaagcc gttgctgatt cgaggcgtta accgtcacga 3780 gcatcatcct ctgcatggtc aggtcatgga tgagcagacg atggtgcagg atatcctgct 3840 gatgaagcag aacaacttta acgccgtgcg ctgttcgcat tatccgaacc atccgctgtg 3900 gtacacgctg tgcgaccgct acggcctgta tgtggtggat gaagccaata ttgaaaccca 3960 cggcatggtg ccaatgaatc gtctgaccga tgatccgcgc tggctaccgg cgatgagcga 4020 acgcgtaacg cgaatggtgc agcgcgatcg taatcacccg agtgtgatca tctggtcgct 4080 ggggaatgaa tcaggccacg gcgctaatca cgacgcgctg tatcgctgga tcaaatctgt 4140 cgatccttcc cgcccggtgc agtatgaagg cggcggagcc gacaccacgg ccaccgatat 4200 tatttgcccg atgtacgcgc gcgtggatga agaccagccc ttcccggctg tgccgaaatg 4260 gtccatcaaa aaatggcttt cgctacctgg agagacgcgc ccgctgatcc tttgcgaata 4320 cgcccacgcg atgggtaaca gtcttggcgg tttcgctaaa tactggcagg cgtttcgtca 4380 gtatccccgt ttacagggcg gcttcgtctg ggactgggtg gatcagtcgc tgattaaata 4440 tgatgaaaac ggcaacccgt ggtcggctta cggcggtgat tttggcgata cgccgaacga 4500 tcgccagttc tgtatgaacg gtctggtctt tgccgaccgc acgccgcatc cagcgctgac 4560 ggaagcaaaa caccagcagc agtttttcca gttccgttta tccgggcaaa ccatcgaagt 4620 gaccagcgaa tacctgttcc gtcatagcga taacgagctc ctgcactgga tggtggcgct 4680 ggatggtaag ccgctggcaa gcggtgaagt gcctctggat gtcgctccac aaggtaaaca 4740 gttgattgaa ctgcctgaac taccgcagcc ggagagcgcc gggcaactct ggctcacagt 4800 
acgcgtagtg caaccgaacg cgaccgcatg gtcagaagcc gggcacatca gcgcctggca 4860 gcagtggcgt ctggcggaaa acctcagtgt gacgctcccc gccgcgtccc acgccatccc 4920 gcatctgacc accagcgaaa tggatttttg catcgagctg ggtaataagc gttggcaatt 4980 taaccgccag tcaggctttc tttcacagat gtggattggc gataaaaaac aactgctgac 5040 gccgctgcgc gatcagttca cccgtgcacc gctggataac gacattggcg taagtgaagc 5100 gacccgcatt gaccctaacg cctgggtcga acgctggaag gcggcgggcc attaccaggc 5160 cgaagcagcg ttgttgcagt gcacggcaga tacacttgct gatgcggtgc tgattacgac 5220 cgctcacgcg tggcagcatc aggggaaaac cttatttatc agccggaaaa cctaccggat 5280 tgatggtagt ggtcaaatgg cgattaccgt tgatgttgaa gtggcgagcg atacaccgca 5340 tccggcgcgg attggcctga actgccagct ggcgcaggta gcagagcggg taaactggct 5400 cggattaggg ccgcaagaaa actatcccga ccgccttact gccgcctgtt ttgaccgctg 5460 ggatctgcca ttgtcagaca tgtatacccc gtacgtcttc ccgagcgaaa acggtctgcg 5520 ctgcgggacg cgcgaattga attatggccc acaccagtgg cgcggcgact tccagttcaa 5580 catcagccgc tacagtcaac agcaactgat ggaaaccagc catcgccatc tgctgcacgc 5640 ggaagaaggc acatggctga atatcgacgg tttccatatg gggattggtg gcgacgactc 5700 ctggagcccg tcagtatcgg cggaattcca gctgagcgcc ggtcgctacc attaccagtt 5760 ggtctggtgt caaaaataat aataaccggg caggggggat ccgcagatcc ggctgtggaa 5820 tgtgtgtcag ttagggtgtg gaaagtcccc aggctcccca gcaggcagaa gtatgcaaag 5880 catgcctgca ggaattcgat atcaagctta tcgataccgt cgacctcgag ggggggcccg 5940 gtacccagct tttgttccct ttagtgaggg ttaattgcgc gggaagtatt tatcactaat 6000 caagcacaag taatacatga gaaactttta ctacagcaag cacaatcctc caaaaaattt 6060 tgtttttaca aaatccctgg tgaacatgat tggaagggac ctactagggt gctgtggaag 6120 ggtgatggtg cagtagtagt taatgatgaa ggaaagggaa taattgctgt accattaacc 6180 aggactaagt tactaataaa accaaattga gtattgttgc aggaagcaag acccaactac 6240 cattgtcagc tgtgtttcct gacctcaata tttgttataa ggtttgatat gaatcccagg 6300 gggaatctca acccctatta cccaacagtc agaaaaatct aagtgtgagg agaacacaat 6360 gtttcaacct tattgttata ataatgacag taagaacagc atggcagaat cgaaggaagc 6420 aagagaccaa gaatgaacct gaaagaagaa tctaaagaag aaaaaagaag aaatgactgg 6480 tggaaaatag 
gtatgtttct gttatgctta gcaggaacta ctggaggaat actttggtgg 6540 tatgaaggac tcccacagca acattatata gggttggtgg cgataggggg aagattaaac 6600 ggatctggcc aatcaaatgc tatagaatgc tggggttcct tcccggggtg tagaccattt 6660 caaaattact tcagttatga gaccaataga agcatgcata tggataataa tactgctaca 6720 ttattagaag ctttaaccaa tataactgct ctataaataa caaaacagaa ttagaaacat 6780 ggaagttagt aaagacttct ggcataactc ctttacctat ttcttctgaa gctaacactg 6840 gactaattag acataagaga gattttggta taagtgcaat agtggcagct attgtagccg 6900 ctactgctat tgctgctagc gctactatgt cttatgttgc tctaactgag gttaacaaaa 6960 taatggaagt acaaaatcat acttttgagg tagaaaatag tactctaaat ggtatggatt 7020 taatagaacg acaaataaag atattatatg ctatgattct tcaaacacat gcagatgttc 7080 aactgttaaa ggaaagacaa caggtagagg agacatttaa tttaattgga tgtatagaaa 7140 gaacacatgt attttgtcat actggtcatc cctggaatat gtcatgggga catttaaatg 7200 agtcaacaca atgggatgac tgggtaagca aaatggaaga tttaaatcaa gagatactaa 7260 ctacacttca tggagccagg aacaatttgg cacaatccat gataacattc aatacaccag 7320 atagtatagc tcaatttgga aaagaccttt ggagtcatat tggaaattgg attcctggat 7380 tgggagcttc cattataaaa tatatagtga tgtttttgct tatttatttg ttactaacct 7440 cttcgcctaa gatcctcagg gccctctgga aggtgaccag tggtgcaggg tcctccggca 7500 gtcgttacct gaagaaaaaa ttccatcaca aacatgcatc gcgagaagac acctgggacc 7560 aggcccaaca caacatacac ctagcaggcg tgaccggtgg atcaggggac aaatactaca 7620 agcagaagta ctccaggaac gactggaatg gagaatcaga ggagtacaac aggcggccaa 7680 agagctgggt gaagtcaatc gaggcatttg gagagagcta tatttccgag aagaccaaag 7740 gggagatttc tcagcctggg gcggctatca acgagcacaa gaacggctct ggggggaaca 7800 atcctcacca agggtcctta gacctggaga ttcgaagcga aggaggaaac atttatgact 7860 gttgcattaa agcccaagaa ggaactctcg ctatcccttg ctgtggattt cccttatggc 7920 tattttgggg actagtaatt atagtaggac gcatagcagg ctatggatta cgtggactcg 7980 ctgttataat aaggatttgt attagaggct taaatttgat atttgaaata atcagaaaaa 8040 tgcttgatta tattggaaga gctttaaatc ctggcacatc tcatgtatca atgcctcagt 8100 atgtttagaa aaacaagggg ggaactgtgg ggtttttatg aggggtttta taaatgatta 8160 taagagtaaa aagaaagttg 
ctgatgctct cataaccttg tataacccaa aggactagct 8220 catgttgcta ggcaactaaa ccgcaataac cgcatttgtg acgcgagttc cccattggtg 8280 acgcgttaac ttcctgtttt tacagtatat aagtgcttgt attctgacaa ttgggcactc 8340 agattctgcg gtctgagtcc cttctctgct gggctgaaaa ggcctttgta ataaatataa 8400 ttctctactc agtccctgtc tctagtttgt ctgttcgaga tcctacagag ctcatgcctt 8460 ggcgtaatca tggtcatagc tgtttcctgt gtgaaattgt tatccgctca caattccaca 8520 caacatacga gccggaagca taaagtgtaa agcctggggt gcctaatgag tgagctaact 8580 cacattaatt gcgttgcgct cactgcccgc tttccagtcg ggaaacctgt cgtgccagct 8640 gcattaatga atcggccaac gcgcggggag aggcggtttg cgtattgggc gctcttccgc 8700 ttcctcgctc actgactcgc tgcgctcggt cgttcggctg cggcgagcgg tatcagctca 8760 ctcaaaggcg gtaatacggt tatccacaga atcaggggat aacgcaggaa agaacatgtg 8820 agcaaaaggc cagcaaaagg ccaggaaccg taaaaaggcc gcgttgctgg cgtttttcca 8880 taggctccgc ccccctgacg agcatcacaa aaatcgacgc tcaagtcaga ggtggcgaaa 8940 cccgacagga ctataaagat accaggcgtt tccccctgga agctccctcg tgcgctctcc 9000 tgttccgacc ctgccgctta ccggatacct gtccgccttt ctcccttcgg gaagcgtggc 9060 gctttctcat agctcacgct gtaggtatct cagttcggtg taggtcgttc gctccaagct 9120 gggctgtgtg cacgaacccc ccgttcagcc cgaccgctgc gccttatccg gtaactatcg 9180 tcttgagtcc aacccggtaa gacacgactt atcgccactg gcagcagcca ctggtaacag 9240 gattagcaga gcgaggtatg taggcggtgc tacagagttc ttgaagtggt ggcctaacta 9300 cggctacact agaaggacag tatttggtat ctgcgctctg ctgaagccag ttaccttcgg 9360 aaaaagagtt ggtagctctt gatccggcaa acaaaccacc gctggtagcg gtggtttttt 9420 tgtttgcaag cagcagatta cgcgcagaaa aaaaggatct caagaagatc ctttgatctt 9480 ttctacgggg tctgacgctc agtggaacga aaactcacgt taagggattt tggtcatgag 9540 attatcaaaa aggatcttca cctagatcct tttaaattaa aaatgaagtt ttaaatcaat 9600 ctaaagtata tatgagtaaa cttggtctga cagttaccaa tgcttaatca gtgaggcacc 9660 tatctcagcg atctgtctat ttcgttcatc catagttgcc tgactccccg tcgtgtagat 9720 aactacgata cgggagggct taccatctgg ccccagtgct gcaatgatac cgcgagaccc 9780 acgctcaccg gctccagatt tatcagcaat aaaccagcca gccggaaggg ccgagcgcag 9840 aagtggtcct gcaactttat ccgcctccat 
ccagtctatt aattgttgcc gggaagctag 9900 agtaagtagt tcgccagtta atagtttgcg caacgttgtt gccattgcta caggcatcgt 9960 ggtgtcacgc tcgtcgtttg gtatggcttc attcagctcc ggttcccaac gatcaaggcg 10020 agttacatga tcccccatgt tgtgcaaaaa agcggttagc tccttcggtc ctccgatcgt 10080 tgtcagaagt aagttggccg cagtgttatc actcatggtt atggcagcac tgcataattc 10140 tcttactgtc atgccatccg taagatgctt ttctgtgact ggtgagtact caaccaagtc 10200 attctgagaa tagtgtatgc ggcgaccgag ttgctcttgc ccggcgtcaa tacgggataa 10260 taccgcgcca catagcagaa ctttaaaagt gctcatcatt ggaaaacgtt cttcggggcg 10320 aaaactctca aggatcttac cgctgttgag atccagttcg atgtaaccca ctcgtgcacc 10380 caactgatct tcagcatctt ttactttcac cagcgtttct gggtgagcaa aaacaggaag 10440 gcaaaatgcc gcaaaaaagg gaataagggc gacacggaaa tgttgaatac tcatactctt 10500 cctttttcaa tattattgaa gcatttatca gggttattgt ctcatgagcg gatacatatt 10560 tgaatgtatt tagaaaaata aacaaatagg ggttccgcgc acatttcccc gaaaagtgcc 10620 acctaaattg taagcgttaa tattttgtta aaattcgcgt taaatttttg ttaaatcagc 10680 tcatttttta accaataggc cgaaatcggc aaaatccctt ataaatcaaa agaatagacc 10740 gagatagggt tgagtgttgt tccagtttgg aacaagagtc cactattaaa gaacgtggac 10800 tccaacgtca aagggcgaaa aaccgtctat cagggcgatg gcccactacg tgaaccatca 10860 ccctaatcaa gttttttggg gtcgaggtgc cgtaaagcac taaatcggaa ccctaaaggg 10920 agcccccgat ttagagcttg acggggaaag ccaacctggc ttatcgaaat taatacgact 10980 cactataggg agaccggc 10998 69 12481 DNA Artificial Sequence Description of Artificial Sequence pONY3.1 69 agatcttcaa tattggccat tagccatatt attcattggt tatatagcat aaatcaatat 60 tggctattgg ccattgcata cgttgtatct atatcataat atgtacattt atattggctc 120 atgtccaata tgaccgccat gttggcattg attattgact agttattaat agtaatcaat 180 tacggggtca ttagttcata gcccatatat ggagttccgc gttacataac ttacggtaaa 240 tggcccgcct ggctgaccgc ccaacgaccc ccgcccattg acgtcaataa tgacgtatgt 300 tcccatagta acgccaatag ggactttcca ttgacgtcaa tgggtggagt atttacggta 360 aactgcccac ttggcagtac atcaagtgta tcatatgcca agtccgcccc ctattgacgt 420 caatgacggt aaatggcccg cctggcatta tgcccagtac atgaccttac gggactttcc 480 
tacttggcag tacatctacg tattagtcat cgctattacc atggtgatgc ggttttggca 540 gtacaccaat gggcgtggat agcggtttga ctcacgggga tttccaagtc tccaccccat 600 tgacgtcaat gggagtttgt tttggcacca aaatcaacgg gactttccaa aatgtcgtaa 660 caactgcgat cgcccgcccc gttgacgcaa atgggcggta ggcgtgtacg gtgggaggtc 720 tatataagca gagctcgttt agtgaaccgt cagatcacta gaagctttat tgcggtagtt 780 tatcacagtt aaattgctaa cgcagtcagt gcttctgaca caacagtctc gaacttaagc 840 tgcagtgact ctcttaaggt agccttgcag aagttggtcg tgaggcactg ggcaggtaag 900 tatcaaggtt acaagacagg tttaaggaga ccaatagaaa ctgggcttgt cgagacagag 960 aagactcttg cgtttctgat aggcacctat tggtcttact gacatccact ttgcctttct 1020 ctccacaggt gtccactccc agttcaatta cagctcttaa ggctagagta cttaatacga 1080 ctcactatag gctagcctcg aggtcgacgg tatcgcccga acagggacct gagaggggcg 1140 cagaccctac ctgttgaacc tggctgatcg taggatcccc gggacagcag aggagaactt 1200 acagaagtct tctggaggtg ttcctggcca gaacacagga ggacaggtaa gatgggagac 1260 cctttgacat ggagcaaggc gctcaagaag ttagagaagg tgacggtaca agggtctcag 1320 aaattaacta ctggtaactg taattgggcg ctaagtctag tagacttatt tcatgatacc 1380 aactttgtaa aagaaaagga ctggcagctg agggatgtca ttccattgct ggaagatgta 1440 actcagacgc tgtcaggaca agaaagagag gcctttgaaa gaacatggtg ggcaatttct 1500 gctgtaaaga tgggcctcca gattaataat gtagtagatg gaaaggcatc attccagctc 1560 ctaagagcga aatatgaaaa gaagactgct aataaaaagc agtctgagcc ctctgaagaa 1620 tatccaatca tgatagatgg ggctggaaac agaaatttta gacctctaac acctagagga 1680 tatactactt gggtgaatac catacagaca aatggtctat taaatgaagc tagtcaaaac 1740 ttatttggga tattatcagt agactgtact tctgaagaaa tgaatgcatt tttggatgtg 1800 gtacctggcc aggcaggaca aaagcagata ttacttgatg caattgataa gatagcagat 1860 gattgggata atagacatcc attaccgaat gctccactgg tggcaccacc acaagggcct 1920 attcccatga cagcaaggtt tattagaggt ttaggagtac ctagagaaag acagatggag 1980 cctgcttttg atcagtttag gcagacatat agacaatgga taatagaagc catgtcagaa 2040 ggcatcaaag tgatgattgg aaaacctaaa gctcaaaata ttaggcaagg agctaaggaa 2100 ccttacccag aatttgtaga cagactatta tcccaaataa aaagtgaggg acatccacaa 2160 gagatttcaa 
aattcttgac tgatacactg actattcaga acgcaaatga ggaatgtaga 2220 aatgctatga gacatttaag accagaggat acattagaag agaaaatgta tgcttgcaga 2280 gacattggaa ctacaaaaca aaagatgatg ttattggcaa aagcacttca gactggtctt 2340 gcgggcccat ttaaaggtgg agccttgaaa ggagggccac taaaggcagc acaaacatgt 2400 tataactgtg ggaagccagg acatttatct agtcaatgta gagcacctaa agtctgtttt 2460 aaatgtaaac agcctggaca tttctcaaag caatgcagaa gtgttccaaa aaacgggaag 2520 caaggggctc aagggaggcc ccagaaacaa actttcccga tacaacagaa gagtcagcac 2580 aacaaatctg ttgtacaaga gactcctcag actcaaaatc tgtacccaga tctgagcgaa 2640 ataaaaaagg aatacaatgt caaggagaag gatcaagtag aggatctcaa cctggacagt 2700 ttgtgggagt aacatataat ctagagaaaa ggcctactac aatagtatta attaatgata 2760 ctcccttaaa tgtactgtta gacacaggag cagatacttc agtgttgact actgcacatt 2820 ataataggtt aaaatataga gggagaaaat atcaagggac gggaataata ggagtgggag 2880 gaaatgtgga aacattttct acgcctgtga ctataaagaa aaagggtaga cacattaaga 2940 caagaatgct agtggcagat attccagtga ctattttggg acgagatatt cttcaggact 3000 taggtgcaaa attggttttg gcacagctct ccaaggaaat aaaatttaga aaaatagagt 3060 taaaagaggg cacaatgggg ccaaaaattc ctcaatggcc actcactaag gagaaactag 3120 aaggggccaa agagatagtc caaagactat tgtcagaggg aaaaatatca gaagctagtg 3180 acaataatcc ttataattca cccatatttg taataaaaaa gaggtctggc aaatggaggt 3240 tattacaaga tctgagagaa ttaaacaaaa cagtacaagt aggaacggaa atatccagag 3300 gattgcctca cccgggagga ttaattaaat gtaaacacat gactgtatta gatattggag 3360 atgcatattt cactataccc ttagatccag agtttagacc atatacagct ttcactattc 3420 cctccattaa tcatcaagaa ccagataaaa gatatgtgtg gaaatgttta ccacaaggat 3480 tcgtgttgag cccatatata tatcagaaaa cattacagga aattttacaa ccttttaggg 3540 aaagatatcc tgaagtacaa ttgtatcaat atatggatga tttgttcatg ggaagtaatg 3600 gttctaaaaa acaacacaaa gagttaatca tagaattaag ggcgatctta ctggaaaagg 3660 gttttgagac accagatgat aaattacaag aagtgccacc ttatagctgg ctaggttatc 3720 aactttgtcc tgaaaattgg aaagtacaaa aaatgcaatt agacatggta aagaatccaa 3780 cccttaatga tgtgcaaaaa ttaatgggga atataacatg gatgagctca gggatcccag 3840 ggttgacagt aaaacacatt 
gcagctacta ctaagggatg tttagagttg aatcaaaaag 3900 taatttggac ggaagaggca caaaaagagt tagaagaaaa taatgagaag attaaaaatg 3960 ctcaagggtt acaatattat aatccagaag aagaaatgtt atgtgaggtt gaaattacaa 4020 aaaattatga ggcaacttat gttataaaac aatcacaagg aatcctatgg gcaggtaaaa 4080 agattatgaa ggctaataag ggatggtcaa cagtaaaaaa tttaatgtta ttgttgcaac 4140 atgtggcaac agaaagtatt actagagtag gaaaatgtcc aacgtttaag gtaccattta 4200 ccaaagagca agtaatgtgg gaaatgcaaa aaggatggta ttattcttgg ctcccagaaa 4260 tagtatatac acatcaagta gttcatgatg attggagaat gaaattggta gaagaaccta 4320 catcaggaat aacaatatac actgatgggg gaaaacaaaa tggagaagga atagcagctt 4380 atgtgaccag taatgggaga actaaacaga aaaggttagg acctgtcact catcaagttg 4440 ctgaaagaat ggcaatacaa atggcattag aggataccag agataaacaa gtaaatatag 4500 taactgatag ttattattgt tggaaaaata ttacagaagg attaggttta gaaggaccac 4560 aaagtccttg gtggcctata atacaaaata tacgagaaaa agagatagtt tattttgctt 4620 gggtacctgg tcacaaaggg atatatggta atcaattggc agatgaagcc gcaaaaataa 4680 aagaagaaat catgctagca taccaaggca cacaaattaa agagaaaaga gatgaagatg 4740 cagggtttga cttatgtgtt ccttatgaca tcatgatacc tgtatctgac acaaaaatca 4800 tacccacaga tgtaaaaatt caagttcctc ctaatagctt tggatgggtc actgggaaat 4860 catcaatggc aaaacagggg ttattaatta atggaggaat aattgatgaa ggatatacag 4920 gagaaataca agtgatatgt actaatattg gaaaaagtaa tattaaatta atagagggac 4980 aaaaatttgc acaattaatt atactacagc atcactcaaa ttccagacag ccttgggatg 5040 aaaataaaat atctcagaga ggggataaag gatttggaag tacaggagta ttctgggtag 5100 aaaatattca ggaagcacaa gatgaacatg agaattggca tacatcacca aagatattgg 5160 caagaaatta taagatacca ttgactgtag caaaacagat aactcaagaa tgtcctcatt 5220 gcactaagca aggatcagga cctgcaggtt gtgtcatgag atctcctaat cattggcagg 5280 cagattgcac acatttggac aataagataa tattgacttt tgtagagtca aattcaggat 5340 acatacatgc tacattattg tcaaaagaaa atgcattatg tacttcattg gctattttag 5400 aatgggcaag attgttttca ccaaagtcct tacacacaga taacggcact aattttgtgg 5460 cagaaccagt tgtaaatttg ttgaagttcc taaagatagc acataccaca ggaataccat 5520 atcatccaga aagtcagggt attgtagaaa 
gggcaaatag gaccttgaaa gagaagattc 5580 aaagtcatag agacaacact caaacactgg aggcagcttt acaacttgct ctcattactt 5640 gtaacaaagg gagggaaagt atgggaggac agacaccatg ggaagtattt atcactaatc 5700 aagcacaagt aatacatgag aaacttttac tacagcaagc acaatcctcc aaaaaatttt 5760 gtttttacaa aatccctggt gaacatgatt ggaagggacc tactagggtg ctgtggaagg 5820 gtgatggtgc agtagtagtt aatgatgaag gaaagggaat aattgctgta ccattaacca 5880 ggactaagtt actaataaaa ccaaattgag tattgttgca ggaagcaaga cccaactacc 5940 attgtcagct gtgtttcctg aggtctctag gaattgatta cctcgatgct tcattaagga 6000 agaagaataa acaaagactg aaggcaatcc aacaaggaag acaacctcaa tatttgttat 6060 aaggtttgat atatgggagt atttggtaaa ggggtaacat ggtcagcatc gcattctatg 6120 ggggaatccc agggggaatc tcaaccccta ttacccaaca gtcagaaaaa tctaagtgtg 6180 aggagaacac aatgtttcaa ccttattgtt ataataatga cagtaagaac agcatggcag 6240 aatcgaagga agcaagagac caagaaatga acctgaaaga agaatctaaa gaagaaaaaa 6300 gaagaaatga ctggtggaaa ataggtatgt ttctgttatg cttagcagga actactggag 6360 gaatactttg gtggtatgaa ggactcccac agcaacatta tatagggttg gtggcgatag 6420 ggggaagatt aaacggatct ggccaatcaa atgctataga atgctggggt tccttcccgg 6480 ggtgtagacc atttcaaaat tacttcagtt atgagaccaa tagaagcatg catatggata 6540 ataatactgc tacattatta gaagctttaa ccaatataac tgctctataa ataacaaaac 6600 agaattagaa acatggaagt tagtaaagac ttctggcata actcctttac ctatttcttc 6660 tgaagctaac actggactaa ttagacataa gagagatttt ggtataagtg caatagtggc 6720 agctattgta gccgctactg ctattgctgc tagcgctact atgtcttatg ttgctctaac 6780 tgaggttaac aaaataatgg aagtacaaaa tcatactttt gaggtagaaa atagtactct 6840 aaatggtatg gatttaatag aacgacaaat aaagatatta tatgctatga ttcttcaaac 6900 acatgcagat gttcaactgt taaaggaaag acaacaggta gaggagacat ttaatttaat 6960 tggatgtata gaaagaacac atgtattttg tcatactggt catccctgga atatgtcatg 7020 gggacattta aatgagtcaa cacaatggga tgactgggta agcaaaatgg aagatttaaa 7080 tcaagagata ctaactacac ttcatggagc caggaacaat ttggcacaat ccatgataac 7140 attcaataca ccagatagta tagctcaatt tggaaaagac ctttggagtc atattggaaa 7200 ttggattcct ggattgggag cttccattat aaaatatata 
gtgatgtttt tgcttattta 7260 tttgttacta acctcttcgc ctaagatcct cagggccctc tggaaggtga ccagtggtgc 7320 agggtcctcc ggcagtcgtt acctgaagaa aaaattccat cacaaacatg catcgcgaga 7380 agacacctgg gaccaggccc aacacaacat acacctagca ggcgtgaccg gtggatcagg 7440 ggacaaatac tacaagcaga agtactccag gaacgactgg aatggagaat cagaggagta 7500 caacaggcgg ccaaagagct gggtgaagtc aatcgaggca tttggagaga gctatatttc 7560 cgagaagacc aaaggggaga tttctcagcc tggggcggct atcaacgagc acaagaacgg 7620 ctctgggggg aacaatcctc accaagggtc cttagacctg gagattcgaa gcgaaggagg 7680 aaacatttat gactgttgca ttaaagccca agaaggaact ctcgctatcc cttgctgtgg 7740 atttccctta tggctatttt ggggactagt aattatagta ggacgcatag caggctatgg 7800 attacgtgga ctcgctgtta taataaggat ttgtattaga ggcttaaatt tgatatttga 7860 aataatcaga aaaatgcttg attatattgg aagagcttta aatcctggca catctcatgt 7920 atcaatgcct cagtatgttt agaaaaacaa ggggggaact gtggggtttt tatgaggggt 7980 tttataaatg attataagag taaaaagaaa gttgctgatg ctctcataac cttgtataac 8040 ccaaaggact agctcatgtt gctaggcaac taaaccgcaa taaccgcatt tgtgacgcga 8100 gttccccatt ggtgacgcgt ggtacctcta gagtcgaccc gggcggccgc ttccctttag 8160 tgagggttaa tgcttcgagc agacatgata agatacattg atgagtttgg acaaaccaca 8220 actagaatgc agtgaaaaaa atgctttatt tgtgaaattt gtgatgctat tgctttattt 8280 gtaaccatta taagctgcaa taaacaagtt aacaacaaca attgcattca ttttatgttt 8340 caggttcagg gggagatgtg ggaggttttt taaagcaagt aaaacctcta caaatgtggt 8400 aaaatccgat aaggatcgat ccgggctggc gtaatagcga agaggcccgc accgatcgcc 8460 cttcccaaca gttgcgcagc ctgaatggcg aatggacgcg ccctgtagcg gcgcattaag 8520 cgcggcgggt gtggtggtta cgcgcagcgt gaccgctaca cttgccagcg ccctagcgcc 8580 cgctcctttc gctttcttcc cttcctttct cgccacgttc gccggctttc cccgtcaagc 8640 tctaaatcgg gggctccctt tagggttccg atttagagct ttacggcacc tcgaccgcaa 8700 aaaacttgat ttgggtgatg gttcacgtag tgggccatcg ccctgataga cggtttttcg 8760 ccctttgacg ttggagtcca cgttctttaa tagtggactc ttgttccaaa ctggaacaac 8820 actcaaccct atctcggtct attcttttga tttataaggg attttgccga tttcggccta 8880 ttggttaaaa aatgagctga tttaacaaat atttaacgcg aattttaaca 
aaatattaac 8940 gtttacaatt tcgcctgatg cggtattttc tccttacgca tctgtgcggt atttcacacc 9000 gcatacgcgg atctgcgcag caccatggcc tgaaataacc tctgaaagag gaacttggtt 9060 aggtaccttc tgaggcggaa agaaccagct gtggaatgtg tgtcagttag ggtgtggaaa 9120 gtccccaggc tccccagcag gcagaagtat gcaaagcatg catctcaatt agtcagcaac 9180 caggtgtgga aagtccccag gctccccagc aggcagaagt atgcaaagca tgcatctcaa 9240 ttagtcagca accatagtcc cgcccctaac tccgcccatc ccgcccctaa ctccgcccag 9300 ttccgcccat tctccgcccc atggctgact aatttttttt atttatgcag aggccgaggc 9360 cgcctcggcc tctgagctat tccagaagta gtgaggaggc ttttttggag gcctaggctt 9420 ttgcaaaaag cttgattctt ctgacacaac agtctcgaac ttaaggctag agccaccatg 9480 attgaacaag atggattgca cgcaggttct ccggccgctt gggtggagag gctattcggc 9540 tatgactggg cacaacagac aatcggctgc tctgatgccg ccgtgttccg gctgtcagcg 9600 caggggcgcc cggttctttt tgtcaagacc gacctgtccg gtgccctgaa tgaactgcag 9660 gacgaggcag cgcggctatc gtggctggcc acgacgggcg ttccttgcgc agctgtgctc 9720 gacgttgtca ctgaagcggg aagggactgg ctgctattgg gcgaagtgcc ggggcaggat 9780 ctcctgtcat ctcaccttgc tcctgccgag aaagtatcca tcatggctga tgcaatgcgg 9840 cggctgcata cgcttgatcc ggctacctgc ccattcgacc accaagcgaa acatcgcatc 9900 gagcgagcac gtactcggat ggaagccggt cttgtcgatc aggatgatct ggacgaagag 9960 catcaggggc tcgcgccagc cgaactgttc gccaggctca aggcgcgcat gcccgacggc 10020 gaggatctcg tcgtgaccca tggcgatgcc tgcttgccga atatcatggt ggaaaatggc 10080 cgcttttctg gattcatcga ctgtggccgg ctgggtgtgg cggaccgcta tcaggacata 10140 gcgttggcta cccgtgatat tgctgaagag cttggcggcg aatgggctga ccgcttcctc 10200 gtgctttacg gtatcgccgc tcccgattcg cagcgcatcg ccttctatcg ccttcttgac 10260 gagttcttct gagcgggact ctggggttcg aaatgaccga ccaagcgacg cccaacctgc 10320 catcacgatg gccgcaataa aatatcttta ttttcattac atctgtgtgt tggttttttg 10380 tgtgaatcga tagcgataag gatccgcgta tggtgcactc tcagtacaat ctgctctgat 10440 gccgcatagt taagccagcc ccgacacccg ccaacacccg ctgacgcgcc ctgacgggct 10500 tgtctgctcc cggcatccgc ttacagacaa gctgtgaccg tctccgggag ctgcatgtgt 10560 cagaggtttt caccgtcatc accgaaacgc gcgagacgaa agggcctcgt 
gatacgccta 10620 tttttatagg ttaatgtcat gataataatg gtttcttaga cgtcaggtgg cacttttcgg 10680 ggaaatgtgc gcggaacccc tatttgttta tttttctaaa tacattcaaa tatgtatccg 10740 ctcatgagac aataaccctg ataaatgctt caataatatt gaaaaaggaa gagtatgagt 10800 attcaacatt tccgtgtcgc ccttattccc ttttttgcgg cattttgcct tcctgttttt 10860 gctcacccag aaacgctggt gaaagtaaaa gatgctgaag atcagttggg tgcacgagtg 10920 ggttacatcg aactggatct caacagcggt aagatccttg agagttttcg ccccgaagaa 10980 cgttttccaa tgatgagcac ttttaaagtt ctgctatgtg gcgcggtatt atcccgtatt 11040 gacgccgggc aagagcaact cggtcgccgc atacactatt ctcagaatga cttggttgag 11100 tactcaccag tcacagaaaa gcatcttacg gatggcatga cagtaagaga attatgcagt 11160 gctgccataa ccatgagtga taacactgcg gccaacttac ttctgacaac gatcggagga 11220 ccgaaggagc taaccgcttt tttgcacaac atgggggatc atgtaactcg ccttgatcgt 11280 tgggaaccgg agctgaatga agccatacca aacgacgagc gtgacaccac gatgcctgta 11340 gcaatggcaa caacgttgcg caaactatta actggcgaac tacttactct agcttcccgg 11400 caacaattaa tagactggat ggaggcggat aaagttgcag gaccacttct gcgctcggcc 11460 cttccggctg gctggtttat tgctgataaa tctggagccg gtgagcgtgg gtctcgcggt 11520 atcattgcag cactggggcc agatggtaag ccctcccgta tcgtagttat ctacacgacg 11580 gggagtcagg caactatgga tgaacgaaat agacagatcg ctgagatagg tgcctcactg 11640 attaagcatt ggtaactgtc agaccaagtt tactcatata tactttagat tgatttaaaa 11700 cttcattttt aatttaaaag gatctaggtg aagatccttt ttgataatct catgaccaaa 11760 atcccttaac gtgagttttc gttccactga gcgtcagacc ccgtagaaaa gatcaaagga 11820 tcttcttgag atcctttttt tctgcgcgta atctgctgct tgcaaacaaa aaaaccaccg 11880 ctaccagcgg tggtttgttt gccggatcaa gagctaccaa ctctttttcc gaaggtaact 11940 ggcttcagca gagcgcagat accaaatact gtccttctag tgtagccgta gttaggccac 12000 cacttcaaga actctgtagc accgcctaca tacctcgctc tgctaatcct gttaccagtg 12060 gctgctgcca gtggcgataa gtcgtgtctt accgggttgg actcaagacg atagttaccg 12120 gataaggcgc agcggtcggg ctgaacgggg ggttcgtgca cacagcccag cttggagcga 12180 acgacctaca ccgaactgag atacctacag cgtgagctat gagaaagcgc cacgcttccc 12240 gaagggagaa aggcggacag gtatccggta 
agcggcaggg tcggaacagg agagcgcacg 12300 agggagcttc cagggggaaa cgcctggtat ctttatagtc ctgtcgggtt tcgccacctc 12360 tgacttgagc gtcgattttt gtgatgctcg tcaggggggc ggagcctatg gaaaaacgcc 12420 agcaacgcgg cctttttacg gttcctggcc ttttgctggc cttttgctca catggctcga 12480 c 12481 70 10112 DNA Artificial Sequence Description of Artificial Sequence pEsynGP 70 tcaatattgg ccattagcca tattattcat tggttatata gcataaatca atattggcta 60 ttggccattg catacgttgt atctatatca taatatgtac atttatattg gctcatgtcc 120 aatatgaccg ccatgttggc attgattatt gactagttat taatagtaat caattacggg 180 gtcattagtt catagcccat atatggagtt ccgcgttaca taacttacgg taaatggccc 240 gcctggctga ccgcccaacg acccccgccc attgacgtca ataatgacgt atgttcccat 300 agtaacgcca atagggactt tccattgacg tcaatgggtg gagtatttac ggtaaactgc 360 ccacttggca gtacatcaag tgtatcatat gccaagtccg ccccctattg acgtcaatga 420 cggtaaatgg cccgcctggc attatgccca gtacatgacc ttacgggact ttcctacttg 480 gcagtacatc tacgtattag tcatcgctat taccatggtg atgcggtttt ggcagtacac 540 caatgggcgt ggatagcggt ttgactcacg gggatttcca agtctccacc ccattgacgt 600 caatgggagt ttgttttggc accaaaatca acgggacttt ccaaaatgtc gtaacaactg 660 cgatcgcccg ccccgttgac gcaaatgggc ggtaggcgtg tacggtggga ggtctatata 720 agcagagctc gtttagtgaa ccgtcagatc actagaagct ttattgcggt agtttatcac 780 agttaaattg ctaacgcagt cagtgcttct gacacaacag tctcgaactt aagctgcagt 840 gactctctta aggtagcctt gcagaagttg gtcgtgaggc actgggcagg taagtatcaa 900 ggttacaaga caggtttaag gagaccaata gaaactgggc ttgtcgagac agagaagact 960 cttgcgtttc tgataggcac ctattggtct tactgacatc cactttgcct ttctctccac 1020 aggtgtccac tcccagttca attacagctc ttaaggctag agtacttaat acgactcact 1080 ataggctaga gaattcgcca ccatgggcga tcccctcacc tggtccaaag ccctgaagaa 1140 actggaaaaa gtcaccgttc agggtagcca aaagcttacc acaggcaatt gcaactgggc 1200 attgtccctg gtggatcttt tccacgacac taatttcgtt aaggagaaag attggcaact 1260 cagagacgtg atccccctct tggaggacgt gacccaaaca ttgtctgggc aggagcgcga 1320 agctttcgag cgcacctggt gggccatcag cgcagtcaaa atggggctgc aaatcaacaa 1380 cgtggttgac ggtaaagcta gctttcaact 
gctccgcgct aagtacgaga agaaaaccgc 1440 caacaagaaa caatccgaac ctagcgagga gtacccaatt atgatcgacg gcgccggcaa 1500 taggaacttc cgcccactga ctcccagggg ctataccacc tgggtcaaca ccatccagac 1560 aaacggactt ttgaacgaag cctcccagaa cctgttcggc atcctgtctg tggactgcac 1620 ctccgaagaa atgaatgctt ttctcgacgt ggtgccagga caggctggac agaaacagat 1680 cctgctcgat gccattgaca agatcgccga cgactgggat aatcgccacc ccctgccaaa 1740 cgcccctctg gtggctcccc cacaggggcc tatccctatg accgctaggt tcattagggg 1800 actgggggtg ccccgcgaac gccagatgga gccagcattt gaccaattta ggcagaccta 1860 cagacagtgg atcatcgaag ccatgagcga ggggattaaa gtcatgatcg gaaagcccaa 1920 ggcacagaac atcaggcagg gggccaagga accataccct gagtttgtcg acaggcttct 1980 gtcccagatt aaatccgaag gccaccctca ggagatctcc aagttcttga cagacacact 2040 gactatccaa aatgcaaatg aagagtgcag aaacgccatg aggcacctca gacctgaaga 2100 taccctggag gagaaaatgt acgcatgtcg cgacattggc actaccaagc aaaagatgat 2160 gctgctcgcc aaggctctgc aaaccggcct ggctggtcca ttcaaaggag gagcactgaa 2220 gggaggtcca ttgaaagctg cacaaacatg ttataattgt gggaagccag gacatttatc 2280 tagtcaatgt agagcaccta aagtctgttt taaatgtaaa cagcctggac atttctcaaa 2340 gcaatgcaga agtgttccaa aaaacgggaa gcaaggggct caagggaggc cccagaaaca 2400 aactttcccg atacaacaga agagtcagca caacaaatct gttgtacaag agactcctca 2460 gactcaaaat ctgtacccag atctgagcga aataaaaaag gaatacaatg tcaaggagaa 2520 ggatcaagta gaggatctca acctggacag tttgtgggag taacatacaa tctcgagaag 2580 aggcccacta ccatcgtcct gatcaatgac acccctctta atgtgctgct ggacaccgga 2640 gccgacacca gcgttctcac tactgctcac tataacagac tgaaatacag aggaaggaaa 2700 taccagggca caggcatcat cggcgttgga ggcaacgtcg aaaccttttc cactcctgtc 2760 accatcaaaa agaaggggag acacattaaa accagaatgc tggtcgccga catccccgtc 2820 accatccttg gcagagacat tctccaggac ctgggcgcta aactcgtgct ggcacaactg 2880 tctaaggaaa tcaagttccg caagatcgag ctgaaagagg gcacaatggg tccaaaaatc 2940 ccccagtggc ccctgaccaa agagaagctt gagggcgcta aggaaatcgt gcagcgcctg 3000 ctttctgagg gcaagattag cgaggccagc gacaataacc cttacaacag ccccatcttt 3060 gtgattaaga aaaggagcgg caaatggaga ctcctgcagg 
acctgaggga actcaacaag 3120 accgtccagg tcggaactga gatctctcgc ggactgcctc accccggcgg cctgattaaa 3180 tgcaagcaca tgacagtcct tgacattgga gacgcttatt ttaccatccc cctcgatcct 3240 gaatttcgcc cctatactgc ttttaccatc cccagcatca atcaccagga gcccgataaa 3300 cgctatgtgt ggaagtgcct cccccaggga tttgtgctta gcccctacat ttaccagaag 3360 acacttcaag agatcctcca acctttccgc gaaagatacc cagaggttca actctaccaa 3420 tatatggacg acctgttcat ggggtccaac gggtctaaga agcagcacaa ggaactcatc 3480 atcgaactga gggcaatcct cctggagaaa ggcttcgaga cacccgacga caagctgcaa 3540 gaagttcctc catatagctg gctgggctac cagctttgcc ctgaaaactg gaaagtccag 3600 aagatgcagt tggatatggt caagaaccca acactgaacg acgtccagaa gctcatgggc 3660 aatattacct ggatgagctc cggaatccct gggcttaccg ttaagcacat tgccgcaact 3720 acaaaaggat gcctggagtt gaaccagaag gtcatttgga cagaggaagc tcagaaggaa 3780 ctggaggaga ataatgaaaa gattaagaat gctcaagggc tccaatacta caatcccgaa 3840 gaagaaatgt tgtgcgaggt cgaaatcact aagaactacg aagccaccta tgtcatcaaa 3900 cagtcccaag gcatcttgtg ggccggaaag aaaatcatga aggccaacaa aggctggtcc 3960 accgttaaaa atctgatgct cctgctccag cacgtcgcca ccgagtctat cacccgcgtc 4020 ggcaagtgcc ccaccttcaa agttcccttc actaaggagc aggtgatgtg ggagatgcaa 4080 aaaggctggt actactcttg gcttcccgag atcgtctaca cccaccaagt ggtgcacgac 4140 gactggagaa tgaagcttgt cgaggagccc actagcggaa ttacaatcta taccgacggc 4200 ggaaagcaaa acggagaggg aatcgctgca tacgtcacat ctaacggccg caccaagcaa 4260 aagaggctcg gccctgtcac tcaccaggtg gctgagagga tggctatcca gatggccctt 4320 gaggacacta gagacaagca ggtgaacatt gtgactgaca gctactactg ctggaaaaac 4380 atcacagagg gccttggcct ggagggaccc cagtctccct ggtggcctat catccagaat 4440 atccgcgaaa aggaaattgt ctatttcgcc tgggtgcctg gacacaaagg aatttacggc 4500 aaccaactcg ccgatgaagc cgccaaaatt aaagaggaaa tcatgcttgc ctaccagggc 4560 acacagatta aggagaagag agacgaggac gctggctttg acctgtgtgt gccatacgac 4620 atcatgattc ccgttagcga cacaaagatc attccaaccg atgtcaagat ccaggtgcca 4680 cccaattcat ttggttgggt gaccggaaag tccagcatgg ctaagcaggg tcttctgatt 4740 aacgggggaa tcattgatga aggatacacc ggcgaaatcc aggtgatctg 
cacaaatatc 4800 ggcaaaagca atattaagct tatcgaaggg cagaagttcg ctcaactcat catcctccag 4860 caccacagca attcaagaca accttgggac gaaaacaaga ttagccagag aggtgacaag 4920 ggcttcggca gcacaggtgt gttctgggtg gagaacatcc aggaagcaca ggacgagcac 4980 gagaattggc acacctcccc taagattttg gcccgcaatt acaagatccc actgactgtg 5040 gctaagcaga tcacacagga atgcccccac tgcaccaaac aaggttctgg ccccgccggc 5100 tgcgtgatga ggtcccccaa tcactggcag gcagattgca cccacctcga caacaaaatt 5160 atcctgacct tcgtggagag caattccggc tacatccacg caacactcct ctccaaggaa 5220 aatgcattgt gcacctccct cgcaattctg gaatgggcca ggctgttctc tccaaaatcc 5280 ctgcacaccg acaacggcac caactttgtg gctgaacctg tggtgaatct gctgaagttc 5340 ctgaaaatcg cccacaccac tggcattccc tatcaccctg aaagccaggg cattgtcgag 5400 agggccaaca gaactctgaa agaaaagatc caatctcaca gagacaatac acagacattg 5460 gaggccgcac ttcagctcgc ccttatcacc tgcaacaaag gaagagaaag catgggcggc 5520 cagaccccct gggaggtctt catcactaac caggcccagg tcatccatga aaagctgctc 5580 ttgcagcagg cccagtcctc caaaaagttc tgcttttata agatccccgg tgagcacgac 5640 tggaaaggtc ctacaagagt tttgtggaaa ggagacggcg cagttgtggt gaacgatgag 5700 ggcaagggga tcatcgctgt gcccctgaca cgcaccaagc ttctcatcaa gccaaactga 5760 acccggggcg gccgcttccc tttagtgagg gttaatgctt cgagcagaca tgataagata 5820 cattgatgag tttggacaaa ccacaactag aatgcagtga aaaaaatgct ttatttgtga 5880 aatttgtgat gctattgctt tatttgtaac cattataagc tgcaataaac aagttaacaa 5940 caacaattgc attcatttta tgtttcaggt tcagggggag atgtgggagg ttttttaaag 6000 caagtaaaac ctctacaaat gtggtaaaat ccgataagga tcgatccggg ctggcgtaat 6060 agcgaagagg cccgcaccga tcgcccttcc caacagttgc gcagcctgaa tggcgaatgg 6120 acgcgccctg tagcggcgca ttaagcgcgg cgggtgtggt ggttacgcgc agcgtgaccg 6180 ctacacttgc cagcgcccta gcgcccgctc ctttcgcttt cttcccttcc tttctcgcca 6240 cgttcgccgg ctttccccgt caagctctaa atcgggggct ccctttaggg ttccgattta 6300 gagctttacg gcacctcgac cgcaaaaaac ttgatttggg tgatggttca cgtagtgggc 6360 catcgccctg atagacggtt tttcgccctt tgacgttgga gtccacgttc tttaatagtg 6420 gactcttgtt ccaaactgga acaacactca accctatctc ggtctattct tttgatttat 
6480 aagggatttt gccgatttcg gcctattggt taaaaaatga gctgatttaa caaatattta 6540 acgcgaattt taacaaaata ttaacgttta caatttcgcc tgatgcggta ttttctcctt 6600 acgcatctgt gcggtatttc acaccgcata cgcggatctg cgcagcacca tggcctgaaa 6660 taacctctga aagaggaact tggttaggta ccttctgagg cggaaagaac cagctgtgga 6720 atgtgtgtca gttagggtgt ggaaagtccc caggctcccc agcaggcaga agtatgcaaa 6780 gcatgcatct caattagtca gcaaccaggt gtggaaagtc cccaggctcc ccagcaggca 6840 gaagtatgca aagcatgcat ctcaattagt cagcaaccat agtcccgccc ctaactccgc 6900 ccatcccgcc cctaactccg cccagttccg cccattctcc gccccatggc tgactaattt 6960 tttttattta tgcagaggcc gaggccgcct cggcctctga gctattccag aagtagtgag 7020 gaggcttttt tggaggccta ggcttttgca aaaagcttga ttcttctgac acaacagtct 7080 cgaacttaag gctagagcca ccatgattga acaagatgga ttgcacgcag gttctccggc 7140 cgcttgggtg gagaggctat tcggctatga ctgggcacaa cagacaatcg gctgctctga 7200 tgccgccgtg ttccggctgt cagcgcaggg gcgcccggtt ctttttgtca agaccgacct 7260 gtccggtgcc ctgaatgaac tgcaggacga ggcagcgcgg ctatcgtggc tggccacgac 7320 gggcgttcct tgcgcagctg tgctcgacgt tgtcactgaa gcgggaaggg actggctgct 7380 attgggcgaa gtgccggggc aggatctcct gtcatctcac cttgctcctg ccgagaaagt 7440 atccatcatg gctgatgcaa tgcggcggct gcatacgctt gatccggcta cctgcccatt 7500 cgaccaccaa gcgaaacatc gcatcgagcg agcacgtact cggatggaag ccggtcttgt 7560 cgatcaggat gatctggacg aagagcatca ggggctcgcg ccagccgaac tgttcgccag 7620 gctcaaggcg cgcatgcccg acggcgagga tctcgtcgtg acccatggcg atgcctgctt 7680 gccgaatatc atggtggaaa atggccgctt ttctggattc atcgactgtg gccggctggg 7740 tgtggcggac cgctatcagg acatagcgtt ggctacccgt gatattgctg aagagcttgg 7800 cggcgaatgg gctgaccgct tcctcgtgct ttacggtatc gccgctcccg attcgcagcg 7860 catcgccttc tatcgccttc ttgacgagtt cttctgagcg ggactctggg gttcgaaatg 7920 accgaccaag cgacgcccaa cctgccatca cgatggccgc aataaaatat ctttattttc 7980 attacatctg tgtgttggtt ttttgtgtga atcgatagcg ataaggatcc gcgtatggtg 8040 cactctcagt acaatctgct ctgatgccgc atagttaagc cagccccgac acccgccaac 8100 acccgctgac gcgccctgac gggcttgtct gctcccggca tccgcttaca gacaagctgt 8160 
gaccgtctcc gggagctgca tgtgtcagag gttttcaccg tcatcaccga aacgcgcgag 8220 acgaaagggc ctcgtgatac gcctattttt ataggttaat gtcatgataa taatggtttc 8280 ttagacgtca ggtggcactt ttcggggaaa tgtgcgcgga acccctattt gtttattttt 8340 ctaaatacat tcaaatatgt atccgctcat gagacaataa ccctgataaa tgcttcaata 8400 atattgaaaa aggaagagta tgagtattca acatttccgt gtcgccctta ttcccttttt 8460 tgcggcattt tgccttcctg tttttgctca cccagaaacg ctggtgaaag taaaagatgc 8520 tgaagatcag ttgggtgcac gagtgggtta catcgaactg gatctcaaca gcggtaagat 8580 ccttgagagt tttcgccccg aagaacgttt tccaatgatg agcactttta aagttctgct 8640 atgtggcgcg gtattatccc gtattgacgc cgggcaagag caactcggtc gccgcataca 8700 ctattctcag aatgacttgg ttgagtactc accagtcaca gaaaagcatc ttacggatgg 8760 catgacagta agagaattat gcagtgctgc cataaccatg agtgataaca ctgcggccaa 8820 cttacttctg acaacgatcg gaggaccgaa ggagctaacc gcttttttgc acaacatggg 8880 ggatcatgta actcgccttg atcgttggga accggagctg aatgaagcca taccaaacga 8940 cgagcgtgac accacgatgc ctgtagcaat ggcaacaacg ttgcgcaaac tattaactgg 9000 cgaactactt actctagctt cccggcaaca attaatagac tggatggagg cggataaagt 9060 tgcaggacca cttctgcgct cggcccttcc ggctggctgg tttattgctg ataaatctgg 9120 agccggtgag cgtgggtctc gcggtatcat tgcagcactg gggccagatg gtaagccctc 9180 ccgtatcgta gttatctaca cgacggggag tcaggcaact atggatgaac gaaatagaca 9240 gatcgctgag ataggtgcct cactgattaa gcattggtaa ctgtcagacc aagtttactc 9300 atatatactt tagattgatt taaaacttca tttttaattt aaaaggatct aggtgaagat 9360 cctttttgat aatctcatga ccaaaatccc ttaacgtgag ttttcgttcc actgagcgtc 9420 agaccccgta gaaaagatca aaggatcttc ttgagatcct ttttttctgc gcgtaatctg 9480 ctgcttgcaa acaaaaaaac caccgctacc agcggtggtt tgtttgccgg atcaagagct 9540 accaactctt tttccgaagg taactggctt cagcagagcg cagataccaa atactgtcct 9600 tctagtgtag ccgtagttag gccaccactt caagaactct gtagcaccgc ctacatacct 9660 cgctctgcta atcctgttac cagtggctgc tgccagtggc gataagtcgt gtcttaccgg 9720 gttggactca agacgatagt taccggataa ggcgcagcgg tcgggctgaa cggggggttc 9780 gtgcacacag cccagcttgg agcgaacgac ctacaccgaa ctgagatacc tacagcgtga 9840 gctatgagaa 
agcgccacgc ttcccgaagg gagaaaggcg gacaggtatc cggtaagcgg 9900 cagggtcgga acaggagagc gcacgaggga gcttccaggg ggaaacgcct ggtatcttta 9960 tagtcctgtc gggtttcgcc acctctgact tgagcgtcga tttttgtgat gctcgtcagg 10020 ggggcggagc ctatggaaaa acgccagcaa cgcggccttt ttacggttcc tggccttttg 10080 ctggcctttt gctcacatgg ctcgacagat ct 10112 71 10114 DNA Artificial Sequence Description of Artificial Sequence pESDSYNGP 71 tcaatattgg ccattagcca tattattcat tggttatata gcataaatca atattggcta 60 ttggccattg catacgttgt atctatatca taatatgtac atttatattg gctcatgtcc 120 aatatgaccg ccatgttggc attgattatt gactagttat taatagtaat caattacggg 180 gtcattagtt catagcccat atatggagtt ccgcgttaca taacttacgg taaatggccc 240 gcctggctga ccgcccaacg acccccgccc attgacgtca ataatgacgt atgttcccat 300 agtaacgcca atagggactt tccattgacg tcaatgggtg gagtatttac ggtaaactgc 360 ccacttggca gtacatcaag tgtatcatat gccaagtccg ccccctattg acgtcaatga 420 cggtaaatgg cccgcctggc attatgccca gtacatgacc ttacgggact ttcctacttg 480 gcagtacatc tacgtattag tcatcgctat taccatggtg atgcggtttt ggcagtacac 540 caatgggcgt ggatagcggt ttgactcacg gggatttcca agtctccacc ccattgacgt 600 caatgggagt ttgttttggc accaaaatca acgggacttt ccaaaatgtc gtaacaactg 660 cgatcgcccg ccccgttgac gcaaatgggc ggtaggcgtg tacggtggga ggtctatata 720 agcagagctc gtttagtgaa ccgtcagatc actagaagct ttattgcggt agtttatcac 780 agttaaattg ctaacgcagt cagtgcttct gacacaacag tctcgaactt aagctgcagt 840 gactctctta aggtagcctt gcagaagttg gtcgtgaggc actgggcagg taagtatcaa 900 ggttacaaga caggtttaag gagaccaata gaaactgggc ttgtcgagac agagaagact 960 cttgcgtttc tgataggcac ctattggtct tactgacatc cactttgcct ttctctccac 1020 aggtgtccac tcccagttca attacagctc ttaaggctag agtacttaat acgactcact 1080 ataggctaga gaattccagg taagatgggc gatcccctca cctggtccaa agccctgaag 1140 aaactggaaa aagtcaccgt tcagggtagc caaaagctta ccacaggcaa ttgcaactgg 1200 gcattgtccc tggtggatct tttccacgac actaatttcg ttaaggagaa agattggcaa 1260 ctcagagacg tgatccccct cttggaggac gtgacccaaa cattgtctgg gcaggagcgc 1320 gaagctttcg agcgcacctg gtgggccatc agcgcagtca aaatggggct 
gcaaatcaac 1380 aacgtggttg acggtaaagc tagctttcaa ctgctccgcg ctaagtacga gaagaaaacc 1440 gccaacaaga aacaatccga acctagcgag gagtacccaa ttatgatcga cggcgccggc 1500 aataggaact tccgcccact gactcccagg ggctatacca cctgggtcaa caccatccag 1560 acaaacggac ttttgaacga agcctcccag aacctgttcg gcatcctgtc tgtggactgc 1620 acctccgaag aaatgaatgc ttttctcgac gtggtgccag gacaggctgg acagaaacag 1680 atcctgctcg atgccattga caagatcgcc gacgactggg ataatcgcca ccccctgcca 1740 aacgcccctc tggtggctcc cccacagggg cctatcccta tgaccgctag gttcattagg 1800 ggactggggg tgccccgcga acgccagatg gagccagcat ttgaccaatt taggcagacc 1860 tacagacagt ggatcatcga agccatgagc gaggggatta aagtcatgat cggaaagccc 1920 aaggcacaga acatcaggca gggggccaag gaaccatacc ctgagtttgt cgacaggctt 1980 ctgtcccaga ttaaatccga aggccaccct caggagatct ccaagttctt gacagacaca 2040 ctgactatcc aaaatgcaaa tgaagagtgc agaaacgcca tgaggcacct cagacctgaa 2100 gataccctgg aggagaaaat gtacgcatgt cgcgacattg gcactaccaa gcaaaagatg 2160 atgctgctcg ccaaggctct gcaaaccggc ctggctggtc cattcaaagg aggagcactg 2220 aagggaggtc cattgaaagc tgcacaaaca tgttataatt gtgggaagcc aggacattta 2280 tctagtcaat gtagagcacc taaagtctgt tttaaatgta aacagcctgg acatttctca 2340 aagcaatgca gaagtgttcc aaaaaacggg aagcaagggg ctcaagggag gccccagaaa 2400 caaactttcc cgatacaaca gaagagtcag cacaacaaat ctgttgtaca agagactcct 2460 cagactcaaa atctgtaccc agatctgagc gaaataaaaa aggaatacaa tgtcaaggag 2520 aaggatcaag tagaggatct caacctggac agtttgtggg agtaacatac aatctcgaga 2580 agaggcccac taccatcgtc ctgatcaatg acacccctct taatgtgctg ctggacaccg 2640 gagccgacac cagcgttctc actactgctc actataacag actgaaatac agaggaagga 2700 aataccaggg cacaggcatc atcggcgttg gaggcaacgt cgaaaccttt tccactcctg 2760 tcaccatcaa aaagaagggg agacacatta aaaccagaat gctggtcgcc gacatccccg 2820 tcaccatcct tggcagagac attctccagg acctgggcgc taaactcgtg ctggcacaac 2880 tgtctaagga aatcaagttc cgcaagatcg agctgaaaga gggcacaatg ggtccaaaaa 2940 tcccccagtg gcccctgacc aaagagaagc ttgagggcgc taaggaaatc gtgcagcgcc 3000 tgctttctga gggcaagatt agcgaggcca gcgacaataa cccttacaac agccccatct 
3060 ttgtgattaa gaaaaggagc ggcaaatgga gactcctgca ggacctgagg gaactcaaca 3120 agaccgtcca ggtcggaact gagatctctc gcggactgcc tcaccccggc ggcctgatta 3180 aatgcaagca catgacagtc cttgacattg gagacgctta ttttaccatc cccctcgatc 3240 ctgaatttcg cccctatact gcttttacca tccccagcat caatcaccag gagcccgata 3300 aacgctatgt gtggaagtgc ctcccccagg gatttgtgct tagcccctac atttaccaga 3360 agacacttca agagatcctc caacctttcc gcgaaagata cccagaggtt caactctacc 3420 aatatatgga cgacctgttc atggggtcca acgggtctaa gaagcagcac aaggaactca 3480 tcatcgaact gagggcaatc ctcctggaga aaggcttcga gacacccgac gacaagctgc 3540 aagaagttcc tccatatagc tggctgggct accagctttg ccctgaaaac tggaaagtcc 3600 agaagatgca gttggatatg gtcaagaacc caacactgaa cgacgtccag aagctcatgg 3660 gcaatattac ctggatgagc tccggaatcc ctgggcttac cgttaagcac attgccgcaa 3720 ctacaaaagg atgcctggag ttgaaccaga aggtcatttg gacagaggaa gctcagaagg 3780 aactggagga gaataatgaa aagattaaga atgctcaagg gctccaatac tacaatcccg 3840 aagaagaaat gttgtgcgag gtcgaaatca ctaagaacta cgaagccacc tatgtcatca 3900 aacagtccca aggcatcttg tgggccggaa agaaaatcat gaaggccaac aaaggctggt 3960 ccaccgttaa aaatctgatg ctcctgctcc agcacgtcgc caccgagtct atcacccgcg 4020 tcggcaagtg ccccaccttc aaagttccct tcactaagga gcaggtgatg tgggagatgc 4080 aaaaaggctg gtactactct tggcttcccg agatcgtcta cacccaccaa gtggtgcacg 4140 acgactggag aatgaagctt gtcgaggagc ccactagcgg aattacaatc tataccgacg 4200 gcggaaagca aaacggagag ggaatcgctg catacgtcac atctaacggc cgcaccaagc 4260 aaaagaggct cggccctgtc actcaccagg tggctgagag gatggctatc cagatggccc 4320 ttgaggacac tagagacaag caggtgaaca ttgtgactga cagctactac tgctggaaaa 4380 acatcacaga gggccttggc ctggagggac cccagtctcc ctggtggcct atcatccaga 4440 atatccgcga aaaggaaatt gtctatttcg cctgggtgcc tggacacaaa ggaatttacg 4500 gcaaccaact cgccgatgaa gccgccaaaa ttaaagagga aatcatgctt gcctaccagg 4560 gcacacagat taaggagaag agagacgagg acgctggctt tgacctgtgt gtgccatacg 4620 acatcatgat tcccgttagc gacacaaaga tcattccaac cgatgtcaag atccaggtgc 4680 cacccaattc atttggttgg gtgaccggaa agtccagcat ggctaagcag ggtcttctga 4740 
ttaacggggg aatcattgat gaaggataca ccggcgaaat ccaggtgatc tgcacaaata 4800 tcggcaaaag caatattaag cttatcgaag ggcagaagtt cgctcaactc atcatcctcc 4860 agcaccacag caattcaaga caaccttggg acgaaaacaa gattagccag agaggtgaca 4920 agggcttcgg cagcacaggt gtgttctggg tggagaacat ccaggaagca caggacgagc 4980 acgagaattg gcacacctcc cctaagattt tggcccgcaa ttacaagatc ccactgactg 5040 tggctaagca gatcacacag gaatgccccc actgcaccaa acaaggttct ggccccgccg 5100 gctgcgtgat gaggtccccc aatcactggc aggcagattg cacccacctc gacaacaaaa 5160 ttatcctgac cttcgtggag agcaattccg gctacatcca cgcaacactc ctctccaagg 5220 aaaatgcatt gtgcacctcc ctcgcaattc tggaatgggc caggctgttc tctccaaaat 5280 ccctgcacac cgacaacggc accaactttg tggctgaacc tgtggtgaat ctgctgaagt 5340 tcctgaaaat cgcccacacc actggcattc cctatcaccc tgaaagccag ggcattgtcg 5400 agagggccaa cagaactctg aaagaaaaga tccaatctca cagagacaat acacagacat 5460 tggaggccgc acttcagctc gcccttatca cctgcaacaa aggaagagaa agcatgggcg 5520 gccagacccc ctgggaggtc ttcatcacta accaggccca ggtcatccat gaaaagctgc 5580 tcttgcagca ggcccagtcc tccaaaaagt tctgctttta taagatcccc ggtgagcacg 5640 actggaaagg tcctacaaga gttttgtgga aaggagacgg cgcagttgtg gtgaacgatg 5700 agggcaaggg gatcatcgct gtgcccctga cacgcaccaa gcttctcatc aagccaaact 5760 gaacccgggg cggccgcttc cctttagtga gggttaatgc ttcgagcaga catgataaga 5820 tacattgatg agtttggaca aaccacaact agaatgcagt gaaaaaaatg ctttatttgt 5880 gaaatttgtg atgctattgc tttatttgta accattataa gctgcaataa acaagttaac 5940 aacaacaatt gcattcattt tatgtttcag gttcaggggg agatgtggga ggttttttaa 6000 agcaagtaaa acctctacaa atgtggtaaa atccgataag gatcgatccg ggctggcgta 6060 atagcgaaga ggcccgcacc gatcgccctt cccaacagtt gcgcagcctg aatggcgaat 6120 ggacgcgccc tgtagcggcg cattaagcgc ggcgggtgtg gtggttacgc gcagcgtgac 6180 cgctacactt gccagcgccc tagcgcccgc tcctttcgct ttcttccctt cctttctcgc 6240 cacgttcgcc ggctttcccc gtcaagctct aaatcggggg ctccctttag ggttccgatt 6300 tagagcttta cggcacctcg accgcaaaaa acttgatttg ggtgatggtt cacgtagtgg 6360 gccatcgccc tgatagacgg tttttcgccc tttgacgttg gagtccacgt tctttaatag 6420 tggactcttg 
ttccaaactg gaacaacact caaccctatc tcggtctatt cttttgattt 6480 ataagggatt ttgccgattt cggcctattg gttaaaaaat gagctgattt aacaaatatt 6540 taacgcgaat tttaacaaaa tattaacgtt tacaatttcg cctgatgcgg tattttctcc 6600 ttacgcatct gtgcggtatt tcacaccgca tacgcggatc tgcgcagcac catggcctga 6660 aataacctct gaaagaggaa cttggttagg taccttctga ggcggaaaga accagctgtg 6720 gaatgtgtgt cagttagggt gtggaaagtc cccaggctcc ccagcaggca gaagtatgca 6780 aagcatgcat ctcaattagt cagcaaccag gtgtggaaag tccccaggct ccccagcagg 6840 cagaagtatg caaagcatgc atctcaatta gtcagcaacc atagtcccgc ccctaactcc 6900 gcccatcccg cccctaactc cgcccagttc cgcccattct ccgccccatg gctgactaat 6960 tttttttatt tatgcagagg ccgaggccgc ctcggcctct gagctattcc agaagtagtg 7020 aggaggcttt tttggaggcc taggcttttg caaaaagctt gattcttctg acacaacagt 7080 ctcgaactta aggctagagc caccatgatt gaacaagatg gattgcacgc aggttctccg 7140 gccgcttggg tggagaggct attcggctat gactgggcac aacagacaat cggctgctct 7200 gatgccgccg tgttccggct gtcagcgcag gggcgcccgg ttctttttgt caagaccgac 7260 ctgtccggtg ccctgaatga actgcaggac gaggcagcgc ggctatcgtg gctggccacg 7320 acgggcgttc cttgcgcagc tgtgctcgac gttgtcactg aagcgggaag ggactggctg 7380 ctattgggcg aagtgccggg gcaggatctc ctgtcatctc accttgctcc tgccgagaaa 7440 gtatccatca tggctgatgc aatgcggcgg ctgcatacgc ttgatccggc tacctgccca 7500 ttcgaccacc aagcgaaaca tcgcatcgag cgagcacgta ctcggatgga agccggtctt 7560 gtcgatcagg atgatctgga cgaagagcat caggggctcg cgccagccga actgttcgcc 7620 aggctcaagg cgcgcatgcc cgacggcgag gatctcgtcg tgacccatgg cgatgcctgc 7680 ttgccgaata tcatggtgga aaatggccgc ttttctggat tcatcgactg tggccggctg 7740 ggtgtggcgg accgctatca ggacatagcg ttggctaccc gtgatattgc tgaagagctt 7800 ggcggcgaat gggctgaccg cttcctcgtg ctttacggta tcgccgctcc cgattcgcag 7860 cgcatcgcct tctatcgcct tcttgacgag ttcttctgag cgggactctg gggttcgaaa 7920 tgaccgacca agcgacgccc aacctgccat cacgatggcc gcaataaaat atctttattt 7980 tcattacatc tgtgtgttgg ttttttgtgt gaatcgatag cgataaggat ccgcgtatgg 8040 tgcactctca gtacaatctg ctctgatgcc gcatagttaa gccagccccg acacccgcca 8100 acacccgctg acgcgccctg 
acgggcttgt ctgctcccgg catccgctta cagacaagct 8160 gtgaccgtct ccgggagctg catgtgtcag aggttttcac cgtcatcacc gaaacgcgcg 8220 agacgaaagg gcctcgtgat acgcctattt ttataggtta atgtcatgat aataatggtt 8280 tcttagacgt caggtggcac ttttcgggga aatgtgcgcg gaacccctat ttgtttattt 8340 ttctaaatac attcaaatat gtatccgctc atgagacaat aaccctgata aatgcttcaa 8400 taatattgaa aaaggaagag tatgagtatt caacatttcc gtgtcgccct tattcccttt 8460 tttgcggcat tttgccttcc tgtttttgct cacccagaaa cgctggtgaa agtaaaagat 8520 gctgaagatc agttgggtgc acgagtgggt tacatcgaac tggatctcaa cagcggtaag 8580 atccttgaga gttttcgccc cgaagaacgt tttccaatga tgagcacttt taaagttctg 8640 ctatgtggcg cggtattatc ccgtattgac gccgggcaag agcaactcgg tcgccgcata 8700 cactattctc agaatgactt ggttgagtac tcaccagtca cagaaaagca tcttacggat 8760 ggcatgacag taagagaatt atgcagtgct gccataacca tgagtgataa cactgcggcc 8820 aacttacttc tgacaacgat cggaggaccg aaggagctaa ccgctttttt gcacaacatg 8880 ggggatcatg taactcgcct tgatcgttgg gaaccggagc tgaatgaagc cataccaaac 8940 gacgagcgtg acaccacgat gcctgtagca atggcaacaa cgttgcgcaa actattaact 9000 ggcgaactac ttactctagc ttcccggcaa caattaatag actggatgga ggcggataaa 9060 gttgcaggac cacttctgcg ctcggccctt ccggctggct ggtttattgc tgataaatct 9120 ggagccggtg agcgtgggtc tcgcggtatc attgcagcac tggggccaga tggtaagccc 9180 tcccgtatcg tagttatcta cacgacgggg agtcaggcaa ctatggatga acgaaataga 9240 cagatcgctg agataggtgc ctcactgatt aagcattggt aactgtcaga ccaagtttac 9300 tcatatatac tttagattga tttaaaactt catttttaat ttaaaaggat ctaggtgaag 9360 atcctttttg ataatctcat gaccaaaatc ccttaacgtg agttttcgtt ccactgagcg 9420 tcagaccccg tagaaaagat caaaggatct tcttgagatc ctttttttct gcgcgtaatc 9480 tgctgcttgc aaacaaaaaa accaccgcta ccagcggtgg tttgtttgcc ggatcaagag 9540 ctaccaactc tttttccgaa ggtaactggc ttcagcagag cgcagatacc aaatactgtc 9600 cttctagtgt agccgtagtt aggccaccac ttcaagaact ctgtagcacc gcctacatac 9660 ctcgctctgc taatcctgtt accagtggct gctgccagtg gcgataagtc gtgtcttacc 9720 gggttggact caagacgata gttaccggat aaggcgcagc ggtcgggctg aacggggggt 9780 tcgtgcacac agcccagctt ggagcgaacg 
acctacaccg aactgagata cctacagcgt 9840 gagctatgag aaagcgccac gcttcccgaa gggagaaagg cggacaggta tccggtaagc 9900 ggcagggtcg gaacaggaga gcgcacgagg gagcttccag ggggaaacgc ctggtatctt 9960 tatagtcctg tcgggtttcg ccacctctga cttgagcgtc gatttttgtg atgctcgtca 10020 ggggggcgga gcctatggaa aaacgccagc aacgcggcct ttttacggtt cctggccttt 10080 tgctggcctt ttgctcacat ggctcgacag atct 10114 72 11131 DNA Artificial Sequence Description of Artificial Sequence pONY4.0Z 72 ctaaattgta agcgttaata ttttgttaaa attcgcgtta aatttttgtt aaatcagctc 60 attttttaac caataggccg aaatcggcaa aatcccttat aaatcaaaag aatagaccga 120 gatagggttg agtgttgttc cagtttggaa caagagtcca ctattaaaga acgtggactc 180 caacgtcaaa gggcgaaaaa ccgtctatca gggcgatggc ccactacgtg aaccatcacc 240 ctaatcaagt tttttggggt cgaggtgccg taaagcacta aatcggaacc ctaaagggag 300 cccccgattt agagcttgac ggggaaagcc aacctggctt atcgaaatta atacgactca 360 ctatagggag accggcagat cttgaataat aaaatgtgtg tttgtccgaa atacgcgttt 420 tgagatttct gtcgccgact aaattcatgt cgcgcgatag tggtgtttat cgccgataga 480 gatggcgata ttggaaaaat tgatatttga aaatatggca tattgaaaat gtcgccgatg 540 tgagtttctg tgtaactgat atcgccattt ttccaaaagt gatttttggg catacgcgat 600 atctggcgat agcgcttata tcgtttacgg gggatggcga tagacgactt tggtgacttg 660 ggcgattctg tgtgtcgcaa atatcgcagt ttcgatatag gtgacagacg atatgaggct 720 atatcgccga tagaggcgac atcaagctgg cacatggcca atgcatatcg atctatacat 780 tgaatcaata ttggccatta gccatattat tcattggtta tatagcataa atcaatattg 840 gctattggcc attgcatacg ttgtatccat atcgtaatat gtacatttat attggctcat 900 gtccaacatt accgccatgt tgacattgat tattgactag ttattaatag taatcaatta 960 cggggtcatt agttcatagc ccatatatgg agttccgcgt tacataactt acggtaaatg 1020 gcccgcctgg ctgaccgccc aacgaccccc gcccattgac gtcaataatg acgtatgttc 1080 ccatagtaac gccaataggg actttccatt gacgtcaatg ggtggagtat ttacggtaaa 1140 ctgcccactt ggcagtacat caagtgtatc atatgccaag tccgccccct attgacgtca 1200 atgacggtaa atggcccgcc tggcattatg cccagtacat gaccttacgg gactttccta 1260 cttggcagta catctacgta ttagtcatcg ctattaccat ggtgatgcgg ttttggcagt 1320 
acaccaatgg gcgtggatag cggtttgact cacggggatt tccaagtctc caccccattg 1380 acgtcaatgg gagtttgttt tggcaccaaa atcaacggga ctttccaaaa tgtcgtaaca 1440 actgcgatcg cccgccccgt tgacgcaaat gggcggtagg cgtgtacggt gggaggtcta 1500 tataagcaga gctcgtttag tgaaccgggc actcagattc tgcggtctga gtcccttctc 1560 tgctgggctg aaaaggcctt tgtaataaat ataattctct actcagtccc tgtctctagt 1620 ttgtctgttc gagatcctac agttggcgcc cgaacaggga cctgagaggg gcgcagaccc 1680 tacctgttga acctggctga tcgtaggatc cccgggacag cagaggagaa cttacagaag 1740 tcttctggag gtgttcctgg ccagaacaca ggaggacagg taagatggga gaccctttga 1800 catggagcaa ggcgctcaag aagttagaga aggtgacggt acaagggtct cagaaattaa 1860 ctactggtaa ctgtaattgg gcgctaagtc tagtagactt atttcatgat accaactttg 1920 taaaagaaaa ggactggcag ctgagggatg tcattccatt gctggaagat gtaactcaga 1980 cgctgtcagg acaagaaaga gaggcctttg aaagaacatg gtgggcaatt tctgctgtaa 2040 agatgggcct ccagattaat aatgtagtag atggaaaggc atcattccag ctcctaagag 2100 cgaaatatga aaagaagact gctaataaaa agcagtctga gccctctgaa gaatatctct 2160 agaactagtg gatcccccgg gctgcaggag tggggaggca cgatggccgc tttggtcgag 2220 gcggatccgg ccattagcca tattattcat tggttatata gcataaatca atattggcta 2280 ttggccattg catacgttgt atccatatca taatatgtac atttatattg gctcatgtcc 2340 aacattaccg ccatgttgac attgattatt gactagttat taatagtaat caattacggg 2400 gtcattagtt catagcccat atatggagtt ccgcgttaca taacttacgg taaatggccc 2460 gcctggctga ccgcccaacg acccccgccc attgacgtca ataatgacgt atgttcccat 2520 agtaacgcca atagggactt tccattgacg tcaatgggtg gagtatttac ggtaaactgc 2580 ccacttggca gtacatcaag tgtatcatat gccaagtacg ccccctattg acgtcaatga 2640 cggtaaatgg cccgcctggc attatgccca gtacatgacc ttatgggact ttcctacttg 2700 gcagtacatc tacgtattag tcatcgctat taccatggtg atgcggtttt ggcagtacat 2760 caatgggcgt ggatagcggt ttgactcacg gggatttcca agtctccacc ccattgacgt 2820 caatgggagt ttgttttggc accaaaatca acgggacttt ccaaaatgtc gtaacaactc 2880 cgccccattg acgcaaatgg gcggtaggca tgtacggtgg gaggtctata taagcagagc 2940 tcgtttagtg aaccgtcaga tcgcctggag acgccatcca cgctgttttg acctccatag 3000 aagacaccgg 
gaccgatcca gcctccgcgg ccccaagctt cagctgctcg aggatctgcg 3060 gatccgggga attccccagt ctcaggatcc accatggggg atcccgtcgt tttacaacgt 3120 cgtgactggg aaaaccctgg cgttacccaa cttaatcgcc ttgcagcaca tccccctttc 3180 gccagctggc gtaatagcga agaggcccgc accgatcgcc cttcccaaca gttgcgcagc 3240 ctgaatggcg aatggcgctt tgcctggttt ccggcaccag aagcggtgcc ggaaagctgg 3300 ctggagtgcg atcttcctga ggccgatact gtcgtcgtcc cctcaaactg gcagatgcac 3360 ggttacgatg cgcccatcta caccaacgta acctatccca ttacggtcaa tccgccgttt 3420 gttcccacgg agaatccgac gggttgttac tcgctcacat ttaatgttga tgaaagctgg 3480 ctacaggaag gccagacgcg aattattttt gatggcgtta actcggcgtt tcatctgtgg 3540 tgcaacgggc gctgggtcgg ttacggccag gacagtcgtt tgccgtctga atttgacctg 3600 agcgcatttt tacgcgccgg agaaaaccgc ctcgcggtga tggtgctgcg ttggagtgac 3660 ggcagttatc tggaagatca ggatatgtgg cggatgagcg gcattttccg tgacgtctcg 3720 ttgctgcata aaccgactac acaaatcagc gatttccatg ttgccactcg ctttaatgat 3780 gatttcagcc gcgctgtact ggaggctgaa gttcagatgt gcggcgagtt gcgtgactac 3840 ctacgggtaa cagtttcttt atggcagggt gaaacgcagg tcgccagcgg caccgcgcct 3900 ttcggcggtg aaattatcga tgagcgtggt ggttatgccg atcgcgtcac actacgtctg 3960 aacgtcgaaa acccgaaact gtggagcgcc gaaatcccga atctctatcg tgcggtggtt 4020 gaactgcaca ccgccgacgg cacgctgatt gaagcagaag cctgcgatgt cggtttccgc 4080 gaggtgcgga ttgaaaatgg tctgctgctg ctgaacggca agccgttgct gattcgaggc 4140 gttaaccgtc acgagcatca tcctctgcat ggtcaggtca tggatgagca gacgatggtg 4200 caggatatcc tgctgatgaa gcagaacaac tttaacgccg tgcgctgttc gcattatccg 4260 aaccatccgc tgtggtacac gctgtgcgac cgctacggcc tgtatgtggt ggatgaagcc 4320 aatattgaaa cccacggcat ggtgccaatg aatcgtctga ccgatgatcc gcgctggcta 4380 ccggcgatga gcgaacgcgt aacgcgaatg gtgcagcgcg atcgtaatca cccgagtgtg 4440 atcatctggt cgctggggaa tgaatcaggc cacggcgcta atcacgacgc gctgtatcgc 4500 tggatcaaat ctgtcgatcc ttcccgcccg gtgcagtatg aaggcggcgg agccgacacc 4560 acggccaccg atattatttg cccgatgtac gcgcgcgtgg atgaagacca gcccttcccg 4620 gctgtgccga aatggtccat caaaaaatgg ctttcgctac ctggagagac gcgcccgctg 4680 atcctttgcg aatacgccca 
cgcgatgggt aacagtcttg gcggtttcgc taaatactgg 4740 caggcgtttc gtcagtatcc ccgtttacag ggcggcttcg tctgggactg ggtggatcag 4800 tcgctgatta aatatgatga aaacggcaac ccgtggtcgg cttacggcgg tgattttggc 4860 gatacgccga acgatcgcca gttctgtatg aacggtctgg tctttgccga ccgcacgccg 4920 catccagcgc tgacggaagc aaaacaccag cagcagtttt tccagttccg tttatccggg 4980 caaaccatcg aagtgaccag cgaatacctg ttccgtcata gcgataacga gctcctgcac 5040 tggatggtgg cgctggatgg taagccgctg gcaagcggtg aagtgcctct ggatgtcgct 5100 ccacaaggta aacagttgat tgaactgcct gaactaccgc agccggagag cgccgggcaa 5160 ctctggctca cagtacgcgt agtgcaaccg aacgcgaccg catggtcaga agccgggcac 5220 atcagcgcct ggcagcagtg gcgtctggcg gaaaacctca gtgtgacgct ccccgccgcg 5280 tcccacgcca tcccgcatct gaccaccagc gaaatggatt tttgcatcga gctgggtaat 5340 aagcgttggc aatttaaccg ccagtcaggc tttctttcac agatgtggat tggcgataaa 5400 aaacaactgc tgacgccgct gcgcgatcag ttcacccgtg caccgctgga taacgacatt 5460 ggcgtaagtg aagcgacccg cattgaccct aacgcctggg tcgaacgctg gaaggcggcg 5520 ggccattacc aggccgaagc agcgttgttg cagtgcacgg cagatacact tgctgatgcg 5580 gtgctgatta cgaccgctca cgcgtggcag catcagggga aaaccttatt tatcagccgg 5640 aaaacctacc ggattgatgg tagtggtcaa atggcgatta ccgttgatgt tgaagtggcg 5700 agcgatacac cgcatccggc gcggattggc ctgaactgcc agctggcgca ggtagcagag 5760 cgggtaaact ggctcggatt agggccgcaa gaaaactatc ccgaccgcct tactgccgcc 5820 tgttttgacc gctgggatct gccattgtca gacatgtata ccccgtacgt cttcccgagc 5880 gaaaacggtc tgcgctgcgg gacgcgcgaa ttgaattatg gcccacacca gtggcgcggc 5940 gacttccagt tcaacatcag ccgctacagt caacagcaac tgatggaaac cagccatcgc 6000 catctgctgc acgcggaaga aggcacatgg ctgaatatcg acggtttcca tatggggatt 6060 ggtggcgacg actcctggag cccgtcagta tcggcggaat tccagctgag cgccggtcgc 6120 taccattacc agttggtctg gtgtcaaaaa taataataac cgggcagggg ggatccgcag 6180 atccggctgt ggaatgtgtg tcagttaggg tgtggaaagt ccccaggctc cccagcaggc 6240 agaagtatgc aaagcatgcc tgcaggaatt cgatatcaag cttatcgata ccgtcgacct 6300 cgaggggggg cccggtaccc agcttttgtt ccctttagtg agggttaatt gcgcgggaag 6360 tatttatcac taatcaagca caagtaatac 
atgagaaact tttactacag caagcacaat 6420 cctccaaaaa attttgtttt tacaaaatcc ctggtgaaca tgattggaag ggacctacta 6480 gggtgctgtg gaagggtgat ggtgcagtag tagttaatga tgaaggaaag ggaataattg 6540 ctgtaccatt aaccaggact aagttactaa taaaaccaaa ttgagtattg ttgcaggaag 6600 caagacccaa ctaccattgt cagctgtgtt tcctgaggtc tctaggaatt gattacctcg 6660 atgcttcatt aaggaagaag aataaacaaa gactgaaggc aatccaacaa ggaagacaac 6720 ctcaatattt gttataaggt ttgatatatg ggagtatttg gtaaaggggt aacatggtca 6780 gcatcgcatt ctatggggga atcccagggg gaatctcaac ccctattacc caacagtcag 6840 aaaaatctaa gtgtgaggag aacacaatgt ttcaacctta ttgttataat aatgacagta 6900 agaacagcat ggcagaatcg aaggaagcaa gagaccaaga aatgaacctg aaagaagaat 6960 ctaaagaaga aaaaagaaga aatgactggt ggaaaatagg tatgtttctg ttatgcttag 7020 caggaactac tggaggaata ctttggtggt atgaaggact cccacagcaa cattatatag 7080 ggttggtggc gataggggga agattaaacg gatctggcca atcaaatgct atagaatgct 7140 ggggttcctt cccggggtgt agaccatttc aaaattactt cagttatgag accaatagaa 7200 gcatgcatat ggataataat actgctacat tattagaagc tttaaccaat ataactgctc 7260 tataaataac aaaacagaat tagaaacatg gaagttagta aagacttctg gcataactcc 7320 tttacctatt tcttctgaag ctaacactgg actaattaga cataagagag attttggtat 7380 aagtgcaata gtggcagcta ttgtagccgc tactgctatt gctgctagcg ctactatgtc 7440 ttatgttgct ctaactgagg ttaacaaaat aatggaagta caaaatcata cttttgaggt 7500 agaaaatagt actctaaatg gtatggattt aatagaacga caaataaaga tattatatgc 7560 tatgattctt caaacacatg cagatgttca actgttaaag gaaagacaac aggtagagga 7620 gacatttaat ttaattggat gtatagaaag aacacatgta ttttgtcata ctggtcatcc 7680 ctggaatatg tcatggggac atttaaatga gtcaacacaa tgggatgact gggtaagcaa 7740 aatggaagat ttaaatcaag agatactaac tacacttcat ggagccagga acaatttggc 7800 acaatccatg ataacattca atacaccaga tagtatagct caatttggaa aagacctttg 7860 gagtcatatt ggaaattgga ttcctggatt gggagcttcc attataaaat atatagtgat 7920 gtttttgctt atttatttgt tactaacctc ttcgcctaag atcctcaggg ccctctggaa 7980 ggtgaccagt ggtgcagggt cctccggcag tcgttacctg aagaaaaaat tccatcacaa 8040 acatgcatcg cgagaagaca cctgggacca ggcccaacac 
aacatacacc tagcaggcgt 8100 gaccggtgga tcaggggaca aatactacaa gcagaagtac tccaggaacg actggaatgg 8160 agaatcagag gagtacaaca ggcggccaaa gagctgggtg aagtcaatcg aggcatttgg 8220 agagagctat atttccgaga agaccaaagg ggagatttct cagcctgggg cggctatcaa 8280 cgagcacaag aacggctctg gggggaacaa tcctcaccaa gggtccttag acctggagat 8340 tcgaagcgaa ggaggaaaca tttatgactg ttgcattaaa gcccaagaag gaactctcgc 8400 tatcccttgc tgtggatttc ccttatggct attttgggga ctagtaatta tagtaggacg 8460 catagcaggc tatggattac gtggactcgc tgttataata aggatttgta ttagaggctt 8520 aaatttgata tttgaaataa tcagaaaaat gcttgattat attggaagag ctttaaatcc 8580 tggcacatct catgtatcaa tgcctcagta tgtttagaaa aacaaggggg gaactgtggg 8640 gtttttatga ggggttttat aaatgattat aagagtaaaa agaaagttgc tgatgctctc 8700 ataaccttgt ataacccaaa ggactagctc atgttgctag gcaactaaac cgcaataacc 8760 gcatttgtga cgcgagttcc ccattggtga cgcgttaact tcctgttttt acagtatata 8820 agtgcttgta ttctgacaat tgggcactca gattctgcgg tctgagtccc ttctctgctg 8880 ggctgaaaag gcctttgtaa taaatataat tctctactca gtccctgtct ctagtttgtc 8940 tgttcgagat cctacagagc tcatgccttg gcgtaatcat ggtcatagct gtttcctgtg 9000 tgaaattgtt atccgctcac aattccacac aacatacgag ccggaagcat aaagtgtaaa 9060 gcctggggtg cctaatgagt gagctaactc acattaattg cgttgcgctc actgcccgct 9120 ttccagtcgg gaaacctgtc gtgccagctg cattaatgaa tcggccaacg cgcggggaga 9180 ggcggtttgc gtattgggcg ctcttccgct tcctcgctca ctgactcgct gcgctcggtc 9240 gttcggctgc ggcgagcggt atcagctcac tcaaaggcgg taatacggtt atccacagaa 9300 tcaggggata acgcaggaaa gaacatgtga gcaaaaggcc agcaaaaggc caggaaccgt 9360 aaaaaggccg cgttgctggc gtttttccat aggctccgcc cccctgacga gcatcacaaa 9420 aatcgacgct caagtcagag gtggcgaaac ccgacaggac tataaagata ccaggcgttt 9480 ccccctggaa gctccctcgt gcgctctcct gttccgaccc tgccgcttac cggatacctg 9540 tccgcctttc tcccttcggg aagcgtggcg ctttctcata gctcacgctg taggtatctc 9600 agttcggtgt aggtcgttcg ctccaagctg ggctgtgtgc acgaaccccc cgttcagccc 9660 gaccgctgcg ccttatccgg taactatcgt cttgagtcca acccggtaag acacgactta 9720 tcgccactgg cagcagccac tggtaacagg attagcagag cgaggtatgt 
aggcggtgct 9780 acagagttct tgaagtggtg gcctaactac ggctacacta gaaggacagt atttggtatc 9840 tgcgctctgc tgaagccagt taccttcgga aaaagagttg gtagctcttg atccggcaaa 9900 caaaccaccg ctggtagcgg tggttttttt gtttgcaagc agcagattac gcgcagaaaa 9960 aaaggatctc aagaagatcc tttgatcttt tctacggggt ctgacgctca gtggaacgaa 10020 aactcacgtt aagggatttt ggtcatgaga ttatcaaaaa ggatcttcac ctagatcctt 10080 ttaaattaaa aatgaagttt taaatcaatc taaagtatat atgagtaaac ttggtctgac 10140 agttaccaat gcttaatcag tgaggcacct atctcagcga tctgtctatt tcgttcatcc 10200 atagttgcct gactccccgt cgtgtagata actacgatac gggagggctt accatctggc 10260 cccagtgctg caatgatacc gcgagaccca cgctcaccgg ctccagattt atcagcaata 10320 aaccagccag ccggaagggc cgagcgcaga agtggtcctg caactttatc cgcctccatc 10380 cagtctatta attgttgccg ggaagctaga gtaagtagtt cgccagttaa tagtttgcgc 10440 aacgttgttg ccattgctac aggcatcgtg gtgtcacgct cgtcgtttgg tatggcttca 10500 ttcagctccg gttcccaacg atcaaggcga gttacatgat cccccatgtt gtgcaaaaaa 10560 gcggttagct ccttcggtcc tccgatcgtt gtcagaagta agttggccgc agtgttatca 10620 ctcatggtta tggcagcact gcataattct cttactgtca tgccatccgt aagatgcttt 10680 tctgtgactg gtgagtactc aaccaagtca ttctgagaat agtgtatgcg gcgaccgagt 10740 tgctcttgcc cggcgtcaat acgggataat accgcgccac atagcagaac tttaaaagtg 10800 ctcatcattg gaaaacgttc ttcggggcga aaactctcaa ggatcttacc gctgttgaga 10860 tccagttcga tgtaacccac tcgtgcaccc aactgatctt cagcatcttt tactttcacc 10920 agcgtttctg ggtgagcaaa aacaggaagg caaaatgccg caaaaaaggg aataagggcg 10980 acacggaaat gttgaatact catactcttc ctttttcaat attattgaag catttatcag 11040 ggttattgtc tcatgagcgg atacatattt gaatgtattt agaaaaataa acaaataggg 11100 gttccgcgca catttccccg aaaagtgcca c 11131 73 517 DNA Artificial Sequence Description of Artificial Sequence Codon optimised EIAV REV 73 gaattcgcca ccatggctga gagcaaggag gccagggatc aagagatgaa cctcaaggaa 60 gagagcaaag aggagaagcg ccgcaacgac tggtggaaga tcgacccaca aggccccctg 120 gagggggacc agtggtgccg cgtgctgaga cagtccctgc ccgaggagaa gattcctagc 180 cagacctgca tcgccagaag acacctcggc cccggtccca cccagcacac 
accctccaga 240 agggataggt ggattagggg ccagattttg caagccgagg tcctccaaga aaggctggaa 300 tggagaatta ggggcgtgca acaagccgct aaagagctgg gagaggtgaa tcgcggcatc 360 tggagggagc tctacttccg cgaggaccag aggggcgatt tctccgcatg gggaggctac 420 cagagggcac aagaaaggct gtggggcgag cagagcagcc cccgcgtctt gaggcccgga 480 gactccaaaa gacgccgcaa acacctgtga agtcgac 517 

1. A viral vector comprising a nucleic acid sequence encoding a receptor.
 2. A retroviral vector derived from a lentivirus genome comprising a nucleic acid sequence capable of directing the expression of a receptor.
 3. A viral vector comprising a nucleic acid sequence encoding a retinoic acid receptor, preferably retinoic acid receptor β2 (RARβ2).
 4. A retroviral vector derived from a lentivirus genome comprising a nucleic acid sequence capable of directing the expression of a retinoic acid receptor, preferably retinoic acid receptor β2 (RARβ2).
 5. Use of a vector according to any preceding claim in the preparation of a medicament to cause neurite development.
 6. Use of a vector according to any preceding claim in the preparation of a medicament for the treatment of a neurological disorder.
 7. A method of treating a neurological disorder comprising administering a vector according to any preceding claim to a subject.
 8. A host cell when transduced by a vector according to any preceding claim.
 9. A pharmaceutical composition comprising a vector according to any preceding claim in admixture with a pharmaceutically acceptable carrier, diluent or excipient; wherein the pharmaceutical composition is for use to cause neurite development.
 10. A retroviral vector derived from a lentivirus genome comprising a nucleic acid sequence capable of directing the expression of at least part of RARβ2 and comprising a deleted gag gene wherein the deletion in gag removes one or more nucleotides downstream of nucleotide 350 of the gag coding sequence.
 11. A retroviral vector according to claim 10 wherein the deletion extends from nucleotide 350 to at least the C-terminus of the gag-pol coding region.
 12. A retroviral vector according to claim 10 or claim 11 wherein the deletion additionally removes nucleotide 300 of the gag coding region.
 13. A retroviral vector according to claim 10 wherein the deletion retains the first 150 nucleotides of the gag coding region.
 14. A retroviral vector according to claim 10 wherein the deletion retains the first 109 nucleotides of the gag coding region.
 15. A retroviral vector according to claim 10 wherein the deletion retains only the first 2 nucleotides of the gag coding region.
 16. A retroviral vector derived from a lentivirus genome wherein one or more accessory genes are absent from the lentivirus genome.
 17. A retroviral vector according to claim 16 wherein the accessory genes are selected from dUTPase, S2, rev and tat.
 18. A retroviral vector derived from a lentivirus genome wherein the lentivirus genome lacks the tat gene but includes the leader sequences between the end of the 5′ LTR and the ATG of gag.
 19. A retroviral vector according to any preceding claim which comprises at least one component from an equine lentivirus.
 20. A retroviral vector according to claim 19 wherein the equine lentivirus is EIAV.
 21. A retroviral vector according to claim 20 wherein the retroviral vector is substantially derived from EIAV.
 22. A method comprising transfecting or transducing a cell with a retroviral vector according to any one of claims 10 to 21.
 23. A delivery system in the form of a retroviral vector according to any one of claims 10 to 21.
 24. A cell transfected or transduced with a retroviral vector according to any one of claims 10 to 21.
 25. Use of a retroviral vector according to any one of claims 10 to 21.
 26. Use of a lentiviral gene therapy vector for the delivery of retinoic acid receptor β2 to a cell comprised by the peripheral or central nervous systems.
 27. A gene therapy vector comprising a nucleic acid sequence encoding a retinoic acid receptor β2.
 28. An EIAV gene therapy vector capable of directing the expression of retinoic acid receptor β2 (RARβ2).
 29. A method for producing expression of RARβ2 in an adult mammalian spinal cord cell comprising transducing or transfecting said cell with a vector according to any of claims 10 to 21 or any of claims 27 to 28.
 30. A method for stimulation of neurite outgrowth and/or regeneration in a mammalian neuronal cell comprising transducing or transfecting said cell with a vector according to any of claims 10 to 21 or any of claims 27 to 28.
 31. Use of a gene therapy vector according to claim 27 or claim 28.
 32. A differential expression screening method for identifying genes involved in a cellular process which method comprises comparing gene expression in: (a) a first cell of interest; and (b) a second cell of interest which cell comprises altered levels, relative to physiological levels, of a biological molecule due to the introduction into the second cell of a heterologous nucleic acid encoding at least part of RARβ2; and identifying gene products whose expression differs.
 33. Use of RARβ2 and/or an agonist thereof in the preparation of a medicament to cause neurite development.
 34. Use of RARβ2 and/or an agonist thereof according to claim 33, wherein said agonist is retinoic acid (RA) and/or CD2019.
 35. Use of RARβ2 and/or an agonist thereof in the preparation of a medicament for the treatment of a neurological disorder.
 36. Use of RARβ2 and/or an agonist thereof according to claim 35, wherein said neurological disorder comprises neurological injury.
 37. A method of treating a neurological disorder comprising administering a pharmacologically active amount of an RARβ2 receptor and/or an agonist thereof.
 38. A method according to claim 37, wherein said agonist is RA and/or CD2019.
 39. A method according to claim 37 or claim 38, wherein said RARβ2 receptor is administered by an entity comprising an RARβ2 expression system.
 40. A method of causing neurite development in a subject, said method comprising providing a nucleic acid construct capable of directing the expression of at least part of an RARβ2 receptor, introducing said construct into one or more cells of said subject, and optionally administering an RARβ2 agonist, such as RA and/or CD2019, to said subject.
 41. A pharmaceutical composition comprising RARβ2 and/or an agonist thereof in admixture with a pharmaceutically acceptable carrier, diluent or excipient; wherein the pharmaceutical composition is for use to cause neurite development.
 42. Use of a receptor in the production of neurite outgrowth. 